This is an archive of past discussions on Wikipedia:Link rot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current main page.
ieee.org
Most of these (search link) are broken and can be replaced.
Batch 1: Checked 1,000 pages and edited 501 pages. Moved 337 links to a new URL. Added 106 {{dead link}}. Switched 4 |url-status=dead to live. Switched 11 |url-status=live to dead. Added 166 archive URLs (112 Wayback). Changed 1,335 citation metadata fields [bug in program, unsure of the actual number]
Batch 2: Checked 7,898 pages and edited 3,943 pages. Moved 2,804 links to a new URL. Removed 3 {{dead link}} templates. Added 575 {{dead link}}. Switched 19 |url-status=dead to live. Switched 78 |url-status=live to dead. Added 1,654 archive URLs (1,454 Wayback). Changed 12,927 citation metadata fields [bug in program, unsure of the actual number]
IABot database: Checked ~25,000 links. Modified about 2,500. Changes will propagate to 300+ wikis.
Notrealname1234: Thank you for the notification. They are deleting all pages on December 20, 2024. IABot has registered 133 unique URLs across 300+ wikis including jawiki. IABot has been disabled on jawiki since early 2023, and there's no telling when it will return. Well, I can do this on enwiki, and update the 133 URLs in IABot, which will save them on jawiki whenever it is enabled. They are still live, but I'll treat them as dead. Might be a few weeks (above work ahead). -- GreenC 17:25, 22 July 2024 (UTC)
Hello. slate.msn.com links don't work. These have both archived redirects and working redirects. Here are examples:
For Raging Cow, changing this to that by removing msn from the URL redirects to the new link here.
If that doesn't work, I've seen archived redirects. This goes here for Peter Maass. Removing the archive from the URL makes a redirect to this new URL.
Redirects also exist without msn. This link goes here for Amazon Theater. Removing the archived part redirects to the new URL here.
~300 links. URLs such as fray.state.msn.com or cagle.slate.msn.com would need regular archives. These links also include ones not in Articlespace, such as talk pages. Thanks! MrLinkinPark333 (talk) 18:57, 26 July 2024 (UTC)
In the third example, this returns a header status 200 and no redirect information, so curl can't see the redirect. It's being redirected by JavaScript. Hopefully an edge case. -- GreenC 23:53, 1 August 2024 (UTC)
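For anyone scripting a similar cleanup later, here is a minimal sketch of the rewrite-and-check step, assuming Python with the requests library (not how WaybackMedic itself is implemented). Because the redirect happens in JavaScript, the server still answers 200 and nothing appears in the headers, so the body is also scanned for a typical JS redirect pattern; that regex is illustrative, not verified against the site.

```python
import re
import requests  # assumption: Python + requests, not the bot's actual implementation

def candidate_slate_url(old_url):
    """Rewrite a slate.msn.com URL to its slate.com equivalent by
    dropping the 'msn.' label from the hostname."""
    return old_url.replace("slate.msn.com", "slate.com", 1)

def check_live(url):
    """Fetch the URL and report the status plus any server-side redirects.
    A JavaScript redirect never shows up in the headers (the server
    answers 200), so the body is also scanned for a common JS pattern."""
    r = requests.get(url, timeout=30, allow_redirects=True)
    js_redirect = re.search(r"window\.location(?:\.href)?\s*=", r.text)
    return {
        "final_url": r.url,
        "status": r.status_code,
        "server_redirects": [h.url for h in r.history],
        "possible_js_redirect": bool(js_redirect),
    }

# Hypothetical example URL, only to show the shape of the rewrite:
print(check_live(candidate_slate_url("http://slate.msn.com/id/2081896/")))
```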
Done Checked 103 pages and edited 53 pages. Moved 53 links to a new URL. Removed 1 {{dead link}} templates. Switched 41 |url-status=dead to live. Added 2 archive URLs (2 Wayback). Changed 4 citation metadata fields.
This was a twister; if you see anything I missed, let me know. See the search, though it might take time for the search cache to reflect the edits. -- GreenC 00:55, 2 August 2024 (UTC)
For No Fly List, making the link into here (without fr/ss) works as a redirect to there. For Godzilla 2000, making the link into here works as a redirect to there (by removing default and changing id= to /id/). No luck with Albert Gore Sr. George W. Bush at Slate doesn't match the article either, so it could be left archived. MrLinkinPark333 (talk) 01:32, 2 August 2024 (UTC)
OK. If you want to adjust those manually, it won't make sense to program and run the bot for these edge cases. -- GreenC 01:44, 2 August 2024 (UTC)
@GreenC: The bot now changes perfectly fine refs that were properly waybacked and marked as 'dead'. This is pointless. In fact, I would argue it's worse. See this edit at Pokémon. This is the waybacked page from slate.msn.com. This is the new page from slate.com.
When I wrote the paragraph in question, I purposely chose the waybacked old page, because the new page is filled with ads and has a very annoying floating, picture-in-picture video that automatically starts playing when the page loads.
On a positive note, the ad blocker not only blocks this, but also busts through the "You seem to have an ad blocker" message. So the ad blocker does work here. But not everyone has an ad blocker installed. - Manifestation (talk) 09:18, 2 August 2024 (UTC)
I understand. Yeah, this is murky territory, because if we are using the Wayback Machine to intentionally bypass a website that otherwise has live content available, it is undermining traffic to the website, and traffic is why websites exist. In response, there is nothing stopping Slate from making a takedown request at Wayback. The entire domain would be taken down, leaving us with no archives even for legitimately dead links (except archive.today, which does not honor most takedown requests). This is not hypothetical; it is happening more frequently. Anyway, I didn't remove the archive URL, and it can be flipped back to dead status; the bot won't reprocess the domain anytime in the foreseeable future. -- GreenC 13:35, 2 August 2024 (UTC)
Per The Sydney Morning Herald, they "will no longer produce editorial content for Insider/BI and there will not be a BIAUS website", so I think it's safe to assume these links are not gonna come back to this domain.
Hello again. Msnbc.msn.com links don't work. Some have redirects that work while others don't. Please note that they redirect to NBC News links. These fall under two categories:
OK it's a soft-redirect -> redirect -> destination: Any URL that contains "/id/", extract the ID and convert to "https://www.msnbc.com/id/{id}/" -- thus http://today.msnbc.msn.com/id/43584191/ns/today-today_people/t/monaco-palace-releases-guest-list-royal-wedding/ converts to https://www.msnbc.com/id/43584191/ .. then follow the redirect to https://www.nbcnews.com/id/wbna43584191 -- GreenC 03:56, 2 August 2024 (UTC)
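A minimal sketch of that rule, assuming Python with the requests library (not necessarily how the bot does it): extract the numeric ID, rebuild the msnbc.com /id/ URL, then follow the redirect chain, since (as noted below) some IDs were never migrated and end in a 404.

```python
import re
import requests

ID_RE = re.compile(r"/id/(\d+)")

def msn_to_msnbc(url):
    """Extract the numeric ID from an msnbc.msn.com URL and rebuild it
    as https://www.msnbc.com/id/{id}/ per the rule above."""
    m = ID_RE.search(url)
    return f"https://www.msnbc.com/id/{m.group(1)}/" if m else None

def resolve(url):
    """Follow the redirect chain; the caller decides whether the final
    status (e.g. 404 for never-migrated IDs) means the link is dead."""
    r = requests.get(url, timeout=30, allow_redirects=True)
    return r.url, r.status_code

old = ("http://today.msnbc.msn.com/id/43584191/ns/today-today_people/"
       "t/monaco-palace-releases-guest-list-royal-wedding/")
step1 = msn_to_msnbc(old)           # https://www.msnbc.com/id/43584191/
print(step1, "->", resolve(step1))  # expected to land on nbcnews.com/id/wbna43584191
```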
Enwiki:
Checked 3,616 pages and edited 1,288 pages. Converted 1 templates. Moved 725 links to a new URL. Removed 4 {{dead link}} templates. Added 291 {{dead link}}. Switched 661 |url-status=dead to live. Switched 20 |url-status=live to dead. Added 182 archive URLs (132 Wayback). Changed 213 citation metadata fields.
It converted 725 links to nbcnews.com .. however the remaining 2,456 were never migrated to NBC, so the pages don't exist. For example this goes to that which goes to a 404 .. soft-redirect -> redirect -> dead link -- GreenC 13:10, 2 August 2024 (UTC)
IABot DB:
About 17,000 links. Updated about 12,500 links, which will propagate to 300+ wikis via IABot. -- GreenC 01:51, 3 August 2024 (UTC)
Done
nbcnews.com/id
Hello. NBC News links with /id/ in the URL redirect to new links. For example, this goes here for General Electric. However, this does not always work:
Keeping only the id in the URL doesn't always work either. Making this into that redirects to a 404 for Legality of euthanasia. The new URL is here and does not match up. I think it would be better to find archived copies for these pages that redirect to 404s, as I can't predict the new URL.
Also, at times links will give a "Something Went Wrong" error but still work after refreshing the page. This happened to me after changing this to the new URL for David Yalof.
~7,250. Any links with /id/wbna after the msnbc request above can be ignored, as they will already be fixed.
User:MrLinkinPark333, for the "Something Went Wrong", I tried the example and it never loads after repeated refreshes. A header check returns "HTTP/1.1 500 Internal Server Error"; 500 is a generic error code used when no more specific error code is available. I tried with a proxy sock IP (VPN) and it returns 206, which is sort of like saying it's a partial shipment, only one data segment arrived, more typical of large data files or video files. These are weird responses, and both are rare. The archive version (from a few days ago) is of a normal news article. I think the conservative solution is to treat them as dead for now until NBC works out whatever went wrong. I'll test and see what percentage are like this. -- GreenC 02:18, 3 August 2024 (UTC)
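In script form, that conservative rule might look roughly like this, assuming Python with the requests library; the retry count and pause are arbitrary, and the "Something Went Wrong" string is simply the error text reported above.

```python
import time
import requests

def classify(url, retries=2, pause=10):
    """Conservative liveness check: anything that does not come back as a
    clean 200 page without the error text, after a few attempts, gets
    treated as dead for now and left to a later re-run."""
    last = None
    for _ in range(retries + 1):
        try:
            r = requests.get(url, timeout=30, allow_redirects=True)
            last = r.status_code
            if r.status_code == 200 and "Something Went Wrong" not in r.text:
                return "live", r.url
        except requests.RequestException:
            last = "network-error"
        time.sleep(pause)
    # Persistent 500s, odd 206s, or the error page all end up here.
    return "treat-as-dead", last
```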
24% of the links are "Something Went Wrong": 1,767 out of 7,423 .. the others converted successfully. Retrying after a pause of several hours makes no difference. Now the proxy does not work either. I don't have much option but to consider them dead links. If this problem lifts in the future it can be reprocessed (note to self: find links in project nbcnewscom.0001-8263 with "grep 'Went Wrong' syslog"). -- GreenC 14:55, 3 August 2024 (UTC)
Perhaps more URLs to Wiley with other paths have died but can be saved by replacing the path (in the above example, /store/) with /doi/. Mind checking? Jonatan Svensson Glad (talk) 23:29, 31 July 2024 (UTC)
Jonatan Svensson Glad, the site uses CloudFlare bot protection. I can't verify if the new URL is live/dead or redirects. Because there are so few, and this seems like it should work, I'll do a blind move. Worst case, I can change them back to /store/. -- GreenC 18:30, 3 August 2024 (UTC)
Checked 111 pages and edited 111 pages. Moved 123 links to a new URL. Removed 9 {{dead link}} templates. Switched 3 |url-status=dead to live. Added 1 archive URLs (1 Wayback).
Jonatan Svensson Glad: On a related note, I spot-checked the edits, and in all cases they were part of citation templates where there was a |doi= parameter that also goes to the same target. Given these |url= point to the content via their DOI, cite-template docs advise against including the URL at all. There are about 16k links to wiley.com/doi URLs and some do not have separate DOI fields, so it would be a harder bot task to fix them. DMacks (talk) 19:10, 3 August 2024 (UTC)
Can Citation bot fix these? I recall it removing URLs when there is a duplicate identifier URL, but it was also controversial in some way, and I can't recall how it settled. -- GreenC 19:18, 3 August 2024 (UTC)
If there is a PMC link (which is open access) or |doi-access=free, then Citation bot removes URLs pointing to some specific domains but not all; I'm unsure which specific domains, though. This is because the title will use the PMC or free DOI link instead. Jonatan Svensson Glad (talk) 19:39, 4 August 2024 (UTC)
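If someone does want to script the harder task DMacks describes, here is a rough sketch of separating citations whose |url= merely repeats the |doi= from those with a DOI-style URL but no |doi= field, assuming Python and plain regexes over wikitext (a real bot would use a proper parser such as mwparserfromhell):

```python
import re

# Crude CS1 field extraction; a real bot would parse the template properly.
URL_RE = re.compile(r"\|\s*url\s*=\s*(\S+)")
DOI_RE = re.compile(r"\|\s*doi\s*=\s*(\S+)")

def classify_citation(template_text):
    """'redundant-url' if |url= is a wiley.com/doi link whose DOI already
    appears in |doi=; 'missing-doi' if there is a DOI-style URL but no
    |doi= field; 'other' for everything else."""
    url = URL_RE.search(template_text)
    doi = DOI_RE.search(template_text)
    if not url or "wiley.com/doi" not in url.group(1):
        return "other"
    if doi and doi.group(1).rstrip("|}") in url.group(1):
        return "redundant-url"
    return "missing-doi" if doi is None else "other"

# Hypothetical citation, just to show the shape:
cite = ("{{cite journal |title=Example |url=https://onlinelibrary.wiley.com/doi/10.1002/example "
        "|doi=10.1002/example |journal=Example Journal}}")
print(classify_citation(cite))  # -> redundant-url
```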
Just double-checked that magazine example, and while it was archived a few times, the magazine doesn't appear to load & just shows a spinning waiting icon. So those might be totally dead links if the Internet Archive copies don't work. Sariel Xilo (talk) 18:13, 2 August 2024 (UTC)
I'm assuming every link in the domain is functionally dead. I'm not verifying that assumption, because they use JavaScript redirects, which I can't detect, thus every page appears to be status 200 (live) which is actually a soft-404 to an end-of-life page. If after the bot is done anyone sees a problem with a link still live but marked dead, I can investigate and redo those links. -- GreenC 01:06, 4 August 2024 (UTC)
User:Sariel Xilo, I forgot to load IABot's database with archive URLs. I did set the domain status to "permadead" at iabot.org, but IABot can't discover archive.today links, which make up a sizeable portion of available archives. Once I've finished the Highway Administration site below, I'll return to this. There are 3,400 unique URLs. -- GreenC 20:04, 4 August 2024 (UTC)
Well, their 404 page is misconfigured to return status 200 (live), example. I'll need to download every URL and web scrape for key words. This kind of basic problem with website management portends other more difficult ones. There are 3,000 pages (articles) on Wikipedia with this domain. -- GreenC 17:53, 4 August 2024 (UTC)
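Since the server answers 200 even for missing pages, the status code alone cannot be trusted; here is a sketch of the download-and-scrape approach, assuming Python with the requests library (the marker phrases are placeholders that would have to be copied from the site's actual error page):

```python
import requests

# Placeholder phrases: replace with wording copied from the domain's
# actual error / end-of-life page, since its 404s come back as 200.
SOFT_404_MARKERS = [
    "page not found",
    "no longer available",
]

def is_soft_404(url):
    """Download the page and scrape it for tell-tale phrases; the HTTP
    status alone cannot distinguish live pages from dead ones here."""
    r = requests.get(url, timeout=30, allow_redirects=True)
    if r.status_code != 200:
        return True  # a genuine error status settles it immediately
    body = r.text.lower()
    return any(marker in body for marker in SOFT_404_MARKERS)
```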
Enwiki in two batches:
Batch 1: Checked 1,000 pages and edited 738 pages. Moved 718 links to a new URL. Added 3 {{dead link}}. Switched 12 |url-status=dead to live. Switched 11 |url-status=live to dead. Added 196 archive URLs (191 Wayback). Changed 76 citation metadata fields.
Batch 2: Checked 2,000 pages and edited 1,579 pages. Moved 1,582 links to a new URL. Added 6 {{dead link}}. Switched 28 |url-status=dead to live. Switched 19 |url-status=live to dead. Added 483 archive URLs (469 Wayback). Changed 179 citation metadata fields.
IABot DB:
Checked about 2,000 unique URLs and modified about 400 which will propagate to 300+ wikis via IABot.
That was fast, thank you. I checked about 50 affected articles on my watchlist, and all the new ts.fi URLs now work in those articles. However, I noticed one problematic edit, but I believe I found the rest of the erroneous edits, as they all appeared to be URLs with unusual characters (colons, semicolons, question marks, commas): edit #1, #2, #3, #4, #5. I found working URLs for them by checking the edit histories (except for this one that seems to be permanently dead), so everything should be good now. Thanks again. --JAAqqO (talk) 20:52, 5 August 2024 (UTC)
Ah yes, those are URLs I came across and intentionally re-routed to the home page, because they were redirecting there anyway as soft-404s (WP:LINKROT#Glossary) and they looked like errors. These are in fact soft-redirects, which require foreknowledge or search and discovery to determine the correct destination. -- GreenC 23:36, 5 August 2024 (UTC)
cdc.gov
CDC recently overhauled their website. Many links now have this interstitial saying the page has moved while linking to the new one. For example: https://www.cdc.gov/niosh/topics/motorvehicle/ -- in defiance of standards, that URL returns a 404 instead of a 301
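A sketch of how the interstitial could be harvested for the new location, assuming Python with the requests library; since the old URL answers 404, the body has to be read despite the error status, and the regex below is only a guess at the page markup, not taken from the real page:

```python
import re
import requests

CDC_LINK_RE = re.compile(r'href="(https://www\.cdc\.gov/[^"]+)"')

def new_cdc_url(old_url):
    """Fetch the interstitial (ignoring the 404 status) and return the
    first cdc.gov link that appears after the 'moved' notice as the
    candidate new location. Would need checking against the real markup."""
    r = requests.get(old_url, timeout=30)
    pos = r.text.lower().find("moved")
    if pos == -1:
        return None
    m = CDC_LINK_RE.search(r.text, pos)
    return m.group(1) if m else None

print(new_cdc_url("https://www.cdc.gov/niosh/topics/motorvehicle/"))
```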
Yup, done. For the future, is there any value for posterity in adding posts here for links that only have a smattering of pages or should I just fix 'em? GrapesRock (talk) 16:50, 1 July 2024 (UTC)
It's hard to say, because it depends on what work is involved in making the fix. I've seen cases where 5 pages can take a long time to figure out manually and are better done by bot. To set up the bot, compile, generate a list of target pages, run the bot, check for errors, upload diffs .. it's like 10 or 15 minutes for a small run. If you can do it faster than that manually, go for it. But even for simple cases, if it's more than around 20 pages don't hesitate to ask for bot help. -- GreenC 18:36, 1 July 2024 (UTC)
Smmsport.com appears to have been usurped by an online gambling operation masquerading as the original site. Some links, such as [1] and [2], appear to still work and are intact with their original content, while others return 404 errors. But anything linked from the home page is fake. --Paul_012 (talk) 11:09, 29 July 2024 (UTC)
They're somewhat insidiously inserted into the first top navigation menu. [3] for example is a link farm advertising gambling sites. --Paul_012 (talk) 15:34, 29 July 2024 (UTC)
Ahh I see. This is a somewhat unusual case of WP:USURPSOURCE. We probably need an edit filter to prevent editors from adding more links they believe are legitimate but are actually insidious spam (ie. MediaWiki_talk:Spam-blacklist#Proposed_additions), and the existing links will be usurped by WaybackMedic (ie. this URLREQ). As the primary discoverer, can you make the Spam Blacklist request? -- GreenC 16:54, 29 July 2024 (UTC)
Thanks. I'm not sure about blacklisting, as their old articles could still be useful references. Also, upon closer look, it seems the situation looks more like a hijacking rather than usurpation? Checking the Wayback Machine, the last good version of the home page was archived on 2023-08-13, before the site went down and showed a domain for sale notice. It came back on 2024-06-15, appearing mostly the same as it last did, but by the next archival on 2024-07-02 the gambling links had been inserted into the navigation menu, and the articles linked from the home page had been altered to show a date of 23 May 2024. --Paul_012 (talk) 14:27, 30 July 2024 (UTC)
The spam blacklist prevents adding new links. Since they appear to have legitimate content, this is a problem of editors unknowingly adding new links into Wikipedia that they found with Google or whatever. It is a classic case of WP:USURPSOURCE. It really needs to be blocked. The old links will be kept and converted to usurped, ie. changed to archive URLs, with the source URL no longer hot linked. -- GreenC 14:45, 30 July 2024 (UTC)
While this claims that it's moved to an army website, that website's news archive only goes back to October 24, 2019, a week before the fortblissbugle went offline. Just searching a handful of titles, I can't find anywhere that individual stories are hosted.
The domain is technically usurped (ie. Emporis). Has 6,000 pages. Will fix in three steps: 1. add archive URLs on enwiki, as a normal dead domain. 2. Same with IABot DB. 3. Later, usurpify everything in a WP:JUDI batch. -- GreenC 03:50, 16 August 2024 (UTC)
Step 2: IABot DB: Checked 24,000 links. Updated 23,520 links (set permadead and added new archive URLs). Changes will propagate to 300+ wikis via IABot.
I didn't find a good way to make these live. The one method Notrealname1234 found worked for some of those PDF files ("systems_i_software_globalization_pdf"), but not all. However, that same method is good for the ftp:// links noted in the next section below, because those links are not on the web (FTP protocol with no https access), and for that reason they have no archives available. Converting to https:// will be a big win. -- GreenC 20:02, 25 August 2024 (UTC)
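For the ftp:// links, the conversion itself is just a scheme swap, but whether the same path exists over https is an assumption that needs checking per link; a minimal sketch, assuming Python with the requests library:

```python
import requests

def https_candidate(ftp_url):
    """Swap ftp:// for https:// and keep host and path unchanged; whether
    that path actually exists over HTTPS must be verified per link."""
    return "https://" + ftp_url[len("ftp://"):] if ftp_url.startswith("ftp://") else None

def is_live(url):
    """HEAD-check the candidate; only move the link if it answers 200."""
    try:
        r = requests.head(url, timeout=30, allow_redirects=True)
        return r.status_code == 200
    except requests.RequestException:
        return False
```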
Enwiki Checked 724 pages and edited 642 pages. Moved 15 links to a new URL. Added 15 {{dead link}}. Switched 13 |url-status=dead to live. Switched 78 |url-status=live to dead. Added 1,258 archive URLs (1,207 Wayback).
IABot DB - checked approx. 2,000 unique URLs. Changes will propagate to 300+ wikis.
GrapesRock, to call this "done" is not accurate, because there is probably more that could be done by searching and evaluation. Nevertheless, I'm going to mark it done for now and move on to other projects. If you discover other rules, I can undo the done tag and keep going. This is, as you said initially, a messy domain; like squeezing water from a stone, the "easy" ones are fixed and what remains is pretty difficult. -- GreenC 16:10, 28 August 2024 (UTC)
Enwiki - Checked 1,469 pages and edited 557 pages. Moved 359 links to a new URL. Resolved 112 ghost redirects. Resolved 1 soft-404s. Removed 4 {{dead link}}. Added 24 {{dead link}}. Switched 245 |url-status=dead to live. Switched 20 |url-status=live to dead. Added 198 archive URLs (117 Wayback). Changed 5 citation metadata fields.