Wikipedia talk:AutoWikiBrowser/Feature requests/Archive 6 Source: en.wikipedia.org/wiki/Wikipedia_talk:AutoWikiBrowser/Feature_requests/Archive_6
This is an archive of past discussions on Wikipedia:AutoWikiBrowser. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
AWB would show the diff you see during manual editing while the bot timer counts down. This way people can have automated editing, but still eyeball a diff once in a while. AWB obeys maxlag now, so the countdown function needs to make itself useful anyway =) –xenotalk03:35, 8 August 2009 (UTC)
Added in revision
IMHO, this needs to be checkbox optional, and probably with a delay of >= 5 seconds. Sound alright? —Reedy10:58, 8 August 2009 (UTC)
I've noticed that there's a fix that eliminates space between two <ref> tags, so I propose an additional fix to eliminate space between the end of a sentence and the <ref> tag. --bender235 (talk) 14:51, 1 August 2009 (UTC)
If we changed the edit box to be a rich text edit box rather than a plain text one, we could introduce a number of new pieces of functionality, such as:
option to use syntax highlighting in the edit box (similar to wikEd)
option to highlight unbalanced brackets in red background
option to highlight alert items in red background (where relevant)
option to highlight changed typos in some background colour (maybe)
This feature request is for the conversion of the text box to a rich text box, so that the items above could be added separately. We would need to preserve existing functionality:
Should almost be a 1:1 swap to go to the RichText editor.. I'll probably have a play with this for you tonight. —Reedy14:45, 26 May 2009 (UTC)
OK, so: AWB wasn't using a TextBox directly; it was using a wrapped text box in the form of "ArticleTextBox". I updated that to be a RichTextBox, and updated the PluginInterface to use "TextBoxBase", which TextBox and RichTextBox both derive from (hopefully avoiding most breakages. This might need changing to an ArticleTextBox in the future, though it probably won't matter too much unless people need more specific things from it). Removed a couple of designer things that caused errors. Updated find to use TextBoxBase as well. rev 4384 —Reedy15:04, 26 May 2009 (UTC)
I think this is causing the diffs to not play so nicely. We get a font change when we go to non-English characters. I'll leave the rev committed for the moment (so Rjwilmsi can have a play); it might just require a few tweaks and properties setting —Reedy15:08, 26 May 2009 (UTC)
Rob, just a thought. If you want to revert this, can you let me know, and i'll do it? As the whole revision doesn't have to be reverted, and i'll make a patch to allow easy changing.. —Reedy15:54, 26 May 2009 (UTC)
I won't be doing much/any AWB work for a few days, so hopefully that will leave you time to resolve the problems. Rjwilmsi11:34, 27 May 2009 (UTC)
When you double-click on a paragraph to undo a change in the diff window, the window goes back to the top after undoing. This can be quite inconvenient when undoing multiple paragraphs towards the bottom of a long page. Would it be possible to add an option to return to the old line in the diff window (or as close as possible) after you do a double-click undo? Hope this makes sense.—Chowbok☠16:28, 16 July 2009 (UTC)
Added in revision
I have also noticed that in addition to this if I make a change in addition to those being suggested by AWB, double clicking will eliminate any manual edits I have made. --Kumioko (talk) 16:32, 16 July 2009 (UTC)
Right now, you can select items, right click, and choose "add selected to list from", but only with a limited number of options - for instance, you can't do what links here (all namespaces). Rather than add options one-by-one as each is requested, how about making it so you select items, choose at the top how to make the list, and then it uses that to generate the list? --NE205:21, 17 July 2009 (UTC)
Added in revision
Hmm. Something like this would have to be done if the text box at the top is empty, i suppose... The other option is to keep it the same, and do it much more programmatically. —Reedy22:10, 18 July 2009 (UTC)
Programmatically implemented this for you. All the providers that require an input (ie use the selected as specified pages), are added. rev 5336. No list maker plugins though.. —Reedy18:07, 6 September 2009 (UTC)
Database scanner: ongoing count of articles scanned. I would like to have more awareness of how quickly the scanning is proceeding, or if it is stuck. Some scans take longer than others and if I want to go and do something else, it would be handy to have a prediction of when to come back and check progress. There is a horizontal scale but it is quite a crude indicator. I am not exactly sure what would be best but I might like to see something like '1678 scanned out of 1,500,000' and/or '12 minutes remaining'. Lightmouse (talk) 17:01, 12 February 2009 (UTC)
Added in revision
Time remaining is probably impossible. IIRC when MaxSem made it multi threaded, the bar became a proper indication of % of the way through... Might be wrong though —Reedy17:10, 12 February 2009 (UTC)
I find the progress bar to be accurate in terms of linearity of progress. What we could do is show the actual percentage and elapsed time on top of the coloured bar so that the user could do a rough calculation (10% took 2 minutes, so ~18 minutes left...). Rjwilmsi18:05, 12 February 2009 (UTC)
We can look at predecessor systems. For example, downloading items from the web gives an interface that has a progress bar with text underneath. The text says 'xx minutes remaining - yy of zz MB (qq kB/s)' or something like that. We all know that the time remaining is just an estimate and resolution only needs to be to the nearest minute. Instead of counting MB, I would like it to count articles. Could we have 'xx minutes remaining - yy of zz articles'? Lightmouse (talk) 12:01, 15 February 2009 (UTC)
Pages can't really be done. The dump doesn't give us the number of pages in the whole thing, so it would mean going through and counting them all before running, which is a waste of time and resources. Currently, the progress bar value is set up using "return (double)stream.Position / stream.Length;", which we could use to display a textual % complete. A start time can be recorded, and the % complete together with the execution time so far can give an expected time to completion (execution time is already recorded). —Reedy20:29, 17 February 2009 (UTC)
Actually pages would be less useful. The dump is front loaded - early articles tend to be larger. RichFarmbrough, 14:51 28 February 2009 (UTC).
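A minimal sketch of the estimate discussed above, written in Python rather than AWB's own C#. It assumes only what the thread states: a byte position in the dump stream, the stream length, and the elapsed execution time. The function name and parameters are hypothetical and are not part of AWB.

```python
from typing import Optional


def estimate_remaining_seconds(position: int, length: int,
                               elapsed_seconds: float) -> Optional[float]:
    """Estimate seconds left from the fraction of the stream already read.

    Hypothetical helper: mirrors the position / length ratio quoted in the
    thread and assumes progress is roughly linear in bytes read.
    """
    if length <= 0 or position <= 0:
        return None  # no progress yet, so no meaningful estimate
    position = min(position, length)
    # remaining time = elapsed time * (bytes left / bytes done)
    return elapsed_seconds * (length - position) / position


# 10% done after 2 minutes gives roughly 18 minutes left.
print(estimate_remaining_seconds(10, 100, 120.0) / 60)
```

Because the dump is said to be front loaded with larger articles, a byte-based estimate like this is likely to be more even than one based on page counts, though it remains only an estimate.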
Add an option to the options menu for List comparer: Add to list comparer when opens: Always / Never / Ask. Default should be Ask as till now but we can speed things for experienced editors. -- Magioladitis (talk) 18:13, 18 September 2009 (UTC)
Do to Database Scanner the same we do for List splitter and List comparer: Add to list maker when opens: Always / Never / Ask. Default should be Ask as till now but we can speed things for experienced editors. -- Magioladitis (talk) 13:08, 26 October 2009 (UTC)
Add a button in the list comparer under each results column to add the contents to the 'working' list. This would remove the need to save the list you want to use and then loading into the working list. mattbr11:05, 22 February 2009 (UTC)
rev 4034... Leaves them in the original box, just adds them to the first listmaker too. —Reedy
I've just used this in 4.5.2.0 and the button adds the list to the first list in the list comparer rather than the 'Make list' list that's used to work through articles (the 'list box' as shown on File:025 AWB illustrations for AWB manual.png) which is what I meant. Sorry for any confusion, please can this be updated? mattbr09:29, 3 May 2009 (UTC)
I did try to get it to do that, but it was being a pain. It should be doable (in the same way that you can "copy" the list maker to the DBScanner.. —Reedy13:31, 3 May 2009 (UTC)
Right click, remove, 2nd option down is All. Should also have Selected in that menu. Ctrl + A then Del would do the same also —Reedy15:24, 4 February 2009 (UTC)
This option is not in the pull-down menu, where I was searching for it. Ctrl+A and then Del is not a good idea when working with 25,000 articles. It's slow. Can you add the Remove all on the pull-down menu? -- Magioladitis (talk) 15:57, 4 February 2009 (UTC)
Seems a little pointless/redundant to me.. As those are used to affect stuff being added to the list (ie remove dupes on adding, etc etc) Although, no reason why it cant go on —Reedy14:49, 11 February 2009 (UTC)
Maybe it's just me but I am used to working with this menu instead of right-clicking. I've been using Ctrl+A and Del and this was really slow. I am OK whether you implement it or not. -- Magioladitis (talk) 14:57, 11 February 2009 (UTC)
It isn't just you. Items in a contextual menu (i.e. right-click) should not be the only method of accessing a feature. Some expert users like contextual menus. Developers (in all applications, not just AWB) are experts and that is why they focus on contextual menus and sometimes overlook the need to ensure access via basic menus. As your report suggests, it is not just a problem for novices, some experts have problems too. That is partly why I suggested a review of contextual menus throughout AWB. This issue is known and mentioned explicitly or implicitly in usability guidelines. In fact, Microsoft guidelines actually use the term 'redundant' as a good thing. I hope that helps. Lightmouse (talk) 10:16, 12 February 2009 (UTC)
KingbotK already does that. Counting newly created pages is sometimes important to me. A counter could be added either at the bottom right or in the toolbar. -- Magioladitis (talk) 13:41, 19 July 2009 (UTC)
AWB should take advantage of the entire available length of edit summaries, like the Gadget available in preferences. –xenotalk14:09, 11 May 2009 (UTC)
Added in revision
AWB uses the 255 chars (or worked out in bytes)... Are we allowed more or something? I know there is the "Allow up to 50 more characters in each of your edit summaries. Works in Internet Explorer, Firefox, and Opera."... —Reedy17:03, 11 May 2009 (UTC)
Well, with Python I was able to use the following edit summary [1]: [[User:Xenobot/6|Bot]] Removing statement "Based on...French Wikipedia" (somewhat inaccurate) & IGN ref, refining INSEE ref, all per [[WP:FRCOM]] request ([[User talk:Xeno|report errors?]], however with AWB, it was truncated at "request (", leaving off the next 34 characters. –xenotalk17:12, 11 May 2009 (UTC)
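A minimal Python sketch of byte-aware truncation, for illustration only. It assumes the limit is 255 bytes of UTF-8, which is one reading of Reedy's "255 chars (or worked out in bytes)"; the real limit and AWB's actual handling are not confirmed by this thread. The function name is hypothetical.

```python
def truncate_summary(summary: str, limit_bytes: int = 255) -> str:
    """Trim an edit summary to at most limit_bytes of UTF-8.

    Hypothetical helper: the 255-byte limit is an assumption taken from
    the discussion, not a verified MediaWiki or AWB constant.
    """
    encoded = summary.encode("utf-8")
    if len(encoded) <= limit_bytes:
        return summary
    # errors="ignore" drops a multi-byte character cut in half at the end
    return encoded[:limit_bytes].decode("utf-8", errors="ignore")
```

Counting bytes rather than characters matters for non-ASCII summaries: a summary of 200 two-byte characters is already 400 bytes long.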
To more efficiently use the limited edit summary space, replace the "using AWB" link, currently Project:AutoWikiBrowser, with WP:AWB. This would save 17 valuable characters, allowing, for example, more typo fixes to be included in the summary, which would sometimes be helpful. This is quite possibly the easiest-to-implement feature request ever. MANdARAX • XAЯAbИAM08:04, 23 July 2009 (UTC)
When the page parameter of a cite template contains pp. or p., it is removed, because cite templates add them too. I think this fix should apply even when the dot is not present, i.e. remove p or pp. I think that with pp there is no doubt it should be removed, with p, I'm not completely sure. (Can a page be named e.g. “p1”?) Svick (talk) 13:52, 29 November 2009 (UTC)
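A minimal Python sketch of the proposed fix, for illustration. It treats "p." and "pp." (with the dot) and bare "pp" as safe to strip when a digit follows, and leaves bare "p" behind an opt-in flag, since Svick is unsure whether a page can legitimately be named "p1". The function name, the flag, and the exact patterns are assumptions and are not AWB's implementation.

```python
import re

# Prefix with a dot: "p. 12", "pp.12-15"
_DOTTED = re.compile(r"^\s*pp?\.\s*(?=\d)", re.IGNORECASE)
# Bare "pp" directly before a number: "pp 12-15"
_BARE_PP = re.compile(r"^\s*pp\s*(?=\d)", re.IGNORECASE)
# Bare "p" directly before a number: "p12" (ambiguous, so opt-in only)
_BARE_P = re.compile(r"^\s*p\s*(?=\d)", re.IGNORECASE)


def strip_page_prefix(value: str, strip_bare_p: bool = False) -> str:
    """Remove a leading p./pp. marker from a cite template page value."""
    for pattern in (_DOTTED, _BARE_PP):
        new_value, count = pattern.subn("", value, count=1)
        if count:
            return new_value
    if strip_bare_p:
        return _BARE_P.sub("", value, count=1)
    return value
```

Requiring a digit after the prefix keeps values such as "Preface" or "xii" untouched.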
Auto tag adds a date parameter to {{fact}}, but not to {{cn}} or {{citation needed}}. I think this should be made consistent by adding it to all redirects of {{citation needed}}. This behaviour is as of version 5673. Svick (talk) 18:37, 29 November 2009 (UTC)
The templates {{idwc}} and {{idw-commons}} have been merged and redirected into {{fdw-iw}} (also accessible through the redirect at {{fdw-commons}}). The new template is backwards compatible with the old ones, but there are some additional parameters that might be useful. Hopefully this change won't cause any issues with AWB. I noticed that Wikipedia:AutoWikiBrowser/User talk templates includes the old templates but not the replacement one (though it's been around for a while). I'm not sure what needs to be done with AWB to make the switch to supporting the updated templates. —Willscrlt ( “Talk” ) 11:58, 17 November 2009 (UTC)
Add an option to preferences to automatically empty the article list when the project changes. In most cases the list is useless in other projects. Not all editors benefit from using the same list across multiple projects. -- Magioladitis (talk) 01:09, 13 December 2009 (UTC)
No reason to be asked if you want to close AWB before a new version is found.
Today's behaviour: User asks updater to check for updates -> Updater asks editor to close AWB and checks whether there are updates or not. Expected behaviour would be: User checks for updates -> If program is up to date, a message informs that there is nothing to update. If program is old, updater asks user to close AWB before proceeding. Moreover, no reason to have 3 options in the Ask window: Yes, No, Cancel. The first 2 can do the job. Done in 5779 -- Magioladitis (talk) 09:59, 15 December 2009 (UTC)
The SVN code is set up to do that (it's undergone various refactorings..) I don't know what version that's in vs what's currently out.. —Reedy12:24, 15 December 2009 (UTC)
Help->Check for updates->"AWB needs to be closed. To do this now, click 'yes'. If you need to save your settings, do this now, the updater will not complete until AWB is closed." -- Magioladitis (talk) 12:30, 15 December 2009 (UTC)
AWB uses the terms ignore and skip implying the same meaning. I suggest that AWB should use only one out of the two, for the sake of uniformity. --Siddhant (talk) 16:48, 9 December 2009 (UTC)
Skip is used when the article is skipped and ignore if something is not considered. For example: "Skip if page is a redirect", "Ignore templates". Where is the mix?
The big "Ignore" key has as text "Skip this page without saving...". Anyway, if people think Ignore is confusing we can change it to Skip. -- Magioladitis (talk) 13:04, 15 December 2009 (UTC)
I think "Skip" makes more sense because you didn't really "Ignore" the page, you looked at it, and decided to explicitly skip it. ("Skipped by user") –xenotalk14:08, 15 December 2009 (UTC)
Also, to save even more space, there's no need to link to categories in these edit summaries, especially since no categories are being explicitly added. For example, where it says "added wikify tag", instead of "[[:Category:Articles that need to be wikified|wikify]]", just use "wikify". Note that the category linked to isn't even the specific one which is added by the tag (it's actually, for example, Category:Articles that need to be wikified from November 2009). MANdARAX • XAЯAbИAM14:03, 29 November 2009 (UTC)
Provide that a double click on the red error message "Unbalanced brackets" takes you to the position of the unbalanced bracket. Likewise for "dead links" too. WilliamKF (talk) 23:39, 1 November 2009 (UTC)
Added in revision
There's already an item on the options menu to highlight the unbalanced bracket errors, and focus will scroll there in the edit box. Rjwilmsi13:50, 2 November 2009 (UTC)
"More... -> Skip if no cat changed" and "More... -> Remove sortkey" lack tooltip texts. Add something like "Automatically skips articles if no category changed" and "Remove sortkey from category" -- Magioladitis (talk) 00:49, 28 December 2009 (UTC)
Well I did notice that the plugin will bypass the redirect when it adds tags, so this will deprecate the redirects over time. –xenotalk19:44, 8 December 2009 (UTC)
Why's it necessary to delete these? I don't see how they're hurting anything, and some people might be in the habit of using some of the redirected names. (Some of those people might figure out what's going on when they make a new page and just get a redlink, but other people might be confused by it.) rʨanaɢtalk/contribs20:09, 8 December 2009 (UTC)
I'm re-creating them, since they're valid possibilities that someone who doesn't know the exact template name might type. --NE210:00, 9 December 2009 (UTC)
In addition to the general "Skip if no replacement is made", add to the Find & Replace box a "Minor" option and the option to "skip if no major replacement is made". AWB would then skip if only minor replacements are made. -- Magioladitis (talk) 13:19, 27 November 2009 (UTC)
Added in revision
Replace existing option with:
Skip if ☑ no replacement ☐ no major replacement is made.
I think that it would be useful to have a function where you can save your "Find and replace" options as general fixes. That could be useful if you want to have a side task to whatever you are doing (e.g. if you want to replace Image --> File, without putting strain on the servers) and have the option to skip those general fixes (just like the ordinary automatic changes). This feature could be hard to fulfill, but I don't think it's impossible. /Poxnar(talk)12:40, 2 June 2009 (UTC)
Added in revision
I suppose it wouldn't be hard to convert to a Custom Module, and you could easily add skip options... —Reedy12:45, 2 June 2009 (UTC)
We can now mark find & replace entries as minor fixes and skip for only minor f&r changes. I think that covers this request. Rjwilmsi01:01, 1 January 2010 (UTC)
I would like an option to force to send stats to the server. Sometimes when I switch account I would like to start from zero again to have better control of my actions. -- Magioladitis (talk)
Currently the only way to logoff an account in AWB is to close the entire application. It would be nice if there was a feature in AWB to logoff rather than either close the app or leave the account logged in. --Kumioko (talk) 18:15, 20 May 2011 (UTC)
Alphabetical sort can be done by choice in more than one place. However, in the list comparer, alphabetical sort is not a choice, it is compulsory. Please can it be made into an option? Lightmouse (talk) 22:12, 5 September 2010 (UTC)
Thanks. I've met perl lovers before, there seems to be a lot of them. I could be one of them if I knew how to get started. Where would I put that script to run it? Lightmouse (talk) 22:03, 6 September 2010 (UTC)
It's inconsistent. Lists are unsorted by default with a simple checkbox option to sort. I like the way you've done that but can't put a strong case for it. The feature either adds value and should exist throughout, or has no value and should be removed throughout. If it's difficult to implement consistently, then it's a judgement call for you guys. Not a big deal, feel free to decline. Lightmouse (talk) 10:15, 23 May 2011 (UTC)
I've been thinking of that today. If we implement the non-alphabetised version it would mean that we would have to check every element of the one list with every element the other and that's slooooooow. I guess if someone really needs this can do it alone by saving both lists and using external sort programs. -- Magioladitis (talk) 10:29, 23 May 2011 (UTC)
You can copy and sort the lists, do the comparison, then link the result back to the original lists. Algorithmically, that's easy (and for each list). I don't know exactly what the result of the comparison should be, though (a list of elements in list a, but not list b, and vice versa?). --Stephan Schulz (talk) 10:45, 23 May 2011 (UTC)
Intersection and symmetric difference found in Filters should give unsorted lists but are buggy. My idea is the following at the moment:
Compare lists: Gives A-B, B-A and (intersection) all alphabetised. Method works with duplicates.
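One way to get A-B, B-A and the intersection without sorting, and without checking every element of one list against every element of the other, is to use hash sets for membership tests while walking each list in its original order. The Python sketch below is an illustration of that idea only, not the list comparer's actual code; the function name is hypothetical, and duplicates in the input are simply carried through to the output.

```python
from typing import List, Tuple


def compare_lists(list_a: List[str],
                  list_b: List[str]) -> Tuple[List[str], List[str], List[str]]:
    """Return (A-B, B-A, intersection), each in original input order.

    Hypothetical helper: set lookups make this roughly linear in the
    combined list size instead of quadratic.
    """
    set_a, set_b = set(list_a), set(list_b)
    only_a = [title for title in list_a if title not in set_b]
    only_b = [title for title in list_b if title not in set_a]
    both = [title for title in list_a if title in set_b]
    return only_a, only_b, both


only_a, only_b, both = compare_lists(
    ["Zebra", "Apple", "Mango", "Apple"], ["Mango", "Kiwi"])
print(only_a, only_b, both)
```

An optional alphabetical sort could then be applied to each result afterwards, which would make sorting a choice rather than a side effect of the comparison.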
At El Culpable Soy Yo the general fixer wants to remove the blank lines from within the {{tracklist}} template, but in this case the author(s) have carefully grouped the lines to make them easier to understand. Is it worth coding an exception to the usual rule? "If the line after the blank sets a template parameter ending in a number, and the line before the blank sets a template parameter that ends in a number one less than this, then allow the blank line to remain" -- John of Reading (talk) 16:05, 23 December 2010 (UTC)
For some reason, the Fix all excess whitespace feature adds whitespace rather than removing it. It adds a space after a bullet or list number. There is no consensus that there should be a space there but even so, it definitely should not be added by a tool that is meant to reduce whitespace. McLerristarr | Mclay104:08, 7 May 2011 (UTC)
The delay timer has a field that allows values in excess of three years. However, any value larger than 119 seconds is no longer treated as a delay, it simply stops AWB from working. I understand that this is a problem with the API but I'm sure we can improve the interface. My preferred solutions would be:
Increase the value beyond two minutes.
Change the interface so invalid values cannot be set by the user.
Change the interface so it warns the user if the value is invalid.
As well, a minor request. When AWB appears in the edit summary, could it be an interwiki link so there aren't massive numbers of redlinks when AWB is used on other projects. Either that, or have it not linked unless its on WP.
Hmm. theres 2 things to take care of, the project differences, and the language differences. I think having it linked wherever, would be the best... Just what if there is the local page... Hmm Reedy Boy16:02, 3 October 2007 (UTC)
I am marking this as partial since there was some progress after 2 years on this one. Plus, many local pages for AWB now exist. -- Magioladitis (talk) 08:08, 14 August 2009 (UTC)
For example I would prefer if in Greek Wikipedia (el.wiki) the message was "με τη χρήση AWB". The word "using" should be customised. -- Magioladitis (talk) 15:40, 27 July 2009 (UTC)
Added in revision
Unless i've completely lost it, this can be done very easily, as the code is in place, it just requires users notifying us that they want it changing. —Reedy20:50, 27 July 2009 (UTC)
An alternative is to provide a UI so the user can specify the string (checking for the cases where 'using AWB' is going to be used, that is). This might be simpler than trying to collect all the appropriate strings for every language. ClickRick (talk) 22:31, 27 July 2009 (UTC)
I have a few more suggestions of edits that could be added to AWB based on some things that I have seen.
Some templates actually say template: within the template, and I recommend this be removed as a general edit.
I have seen quite a lot of articles with accessedate= within a reference instead of accessdate and I recommend that be added.
I have seen a lot of articles with date accessed= within a reference instead of accessdate and I recommend that be added.
Some tables use the dts template with a parameter of link=off. Since the date linking was removed from the logic of this template, this parameter is deprecated and is no longer needed.
Not sure if this can be done but it would be useful to add functionality to identify and recommend changes to some types of prose that is against the MOS. Here are a few examples:
1) "also sometimes" should be changed to "sometimes"
2) "surplus left over" should be changed to "surplus"
3) "tried and true" to "reliable"
4) "unresolved problem" to "problem"
5) "resulting effect" to "effect"
6) "repeat the same" to "repeat"
7) "month period" to "months"
8) "in a nutshell" to "in short"
9) "during the year" to "during" There are lots more but I will leave it there for now.
Number 5 should go to the RegexTypoFix page. As that is what that sorta thing is for (not just actual typos). I thought AWB did remove the Template: from Templates.... —Reedy19:48, 22 June 2009 (UTC)
On number 5 do you want me to put the whole list of the ones I have? in regards to the template: thing maybe I have an old version and the new version will fix that when it comes out. --Kumioko (talk) 19:54, 22 June 2009 (UTC)
Thank you, just to verify did you use "accessed" or "date accessed". I should have been more clear but you should be looking for "date accessed" and changing that not just "accessed"--Kumioko (talk) 17:02, 23 June 2009 (UTC)
For number 1 it appears that AWB has been doing this within FixSyntax for at least two years (SyntaxRegexTemplate regex). Do you have examples of where it didn't? Rjwilmsi19:16, 26 June 2009 (UTC)
No because when I noticed it wasn't doing it I made a manual edit in my AWB. I will deactivate that and see if it comes up again and let you know.--Kumioko (talk) 19:38, 26 June 2009 (UTC)
The existing category and stub features (Guess birth/death dates and Ctrl-T) are great but some enhancements would help even more.
When searching for the birth and death dates, ignore any dated cleanup/wikify etc templates at the top of the article
If there's a DEFAULTSORT, don't append the name key to the generated xxxx births and xxxx deaths categories
Provide a selection list of commonly used categories, such as Living people, Date of birth missing, Year of birth missing, Date of birth missing (living people), Year of death missing
On Ctrl-T, allow selection from a list of recently used categories
Allow selection from a list of recently used stub templates too
How about a 'categorise human' button that would add a human name DEFAULTSORT, guess birth and death dates, add 'Living people' category, if appropriate, or one or more of the 'date missing' categories mentioned above, and convert a {{stub}} template, if present, to a {{bio-stub}}.
Can we also include change so that {{Lifetime}} is included after the categories? As per the Usage guidelines for Lifetime template, it needs to come last, but I believe AWB moves the Lifetime template before the categories. VasuVR (talk, contribs) 15:36, 25 February 2009 (UTC)
Hm... I am not sure that the "usage guidelines" are correct! Since Lifetime includes DEFAULTSORT it should be in the same place DEFAULTSORT exists. The only reason to put it at the bottom is to override any usage of DEFAULTSORT! -- Magioladitis (talk) 21:58, 25 February 2009 (UTC)
I do not agree. Default sort does not affect the content or display of the page, I guess. It only affects the listing within a category - hence DEFAULTSORT can be anywhere. So, Lifetime's position need not be affected by the fact that it includes DEFAULTSORT. Also, the Categories for Living people, year of birth, year of death, are not more important than other categories. Hence they should come at the end among list of categories. VasuVR (talk, contribs) 05:25, 28 February 2009 (UTC)
rev 4460 AWB will now automatically add 'XXXX births/deaths', 'living people' category etc. where the date is available in the article (either following name in bold at top, or within {{birth date}} template or similar). There's a skip option and database scanner option for this logic. Rjwilmsi16:58, 6 June 2009 (UTC)
I think we can close that. "Date of birth missing" and "Date of birth missing (living people)" should not be in article namespace. They are intended for talk pages only. The rest are already implemented. -- Magioladitis (talk) 23:23, 21 July 2009 (UTC)
I'd like to be able to follow redirects using the list comparer. I have a specific problem which may translate to other situations. I have a list of names of federal judges from the Federal Judicial Center, and I am trying to find out which names are missing from Wikipedia. Comparing my list to the names in Wikipedia's categories covering these people is unhelpful because in many cases the Wikipedia article is at a different variation of the name, with the FJC version redirecting to it. If the list comparer could follow the redirects from the names in the FJC list to the names in the federal judge categories, I could get a much more accurate list of which names on the FJC list are missing from Wikipedia. bd2412T23:16, 10 February 2009 (UTC)
Added in revision
If this is done, it should be optional, since there are other cases where redirects will be categorized. --NE201:28, 11 February 2009 (UTC)
Try this - it's a bit complicated but it should work. Turn off all general fixes, and set it to skip if it has the text Category:Foo, and to skip if no changes are made. Load the list and set it to follow redirects. Then run through, and after it's done (there won't be anything to save) look at the "skipped" box in the logs tab. Filter to exclude the skip reason "no change" and you have your list. --NE208:15, 11 February 2009 (UTC)
This couldn't be done by a bot. The page has to be checked before it is deleted to make sure the blue link isn't removed if the page is going to be deleted in 15 minutes. This is why semi-automatic would be great. Acebulf (talk) 19:39, 20 May 2009 (UTC)
I do it the following way: Load page in AWB with links on page (no red links). Save new list. Reload initial list and exclude articles found in new list. Save result and replace initial page with it.
In the case we implement this the name should be: "Links on page (only red links)" and rename the "... (no red links)" to "Links on page (only blue links)". -- Magioladitis (talk) 22:15, 27 December 2009 (UTC)
Per discussion with Reedy, I would like to make the following suggestion. The Options menu obviously needs cleanup. Many items there don't change during a single editing session and are mainly part of an editor's style. So I suggest that a tab is created in the preferences form under the title "Editing style" or something, and that the following are moved there:
"Preview the diff in bot mode" could probably slide in there too, but contra Mag, I would like "Auto save settings" to remain quickly accessible. –xenotalk18:22, 13 October 2009 (UTC)
Have been getting these 504: Gateway Timeout's on en.wiki more and more these days, I would like to see AWB pause for 5 minutes and try again, and maybe only terminate operation after 3 repeated attempts that led to gateway timeouts. –xenotalk14:06, 15 December 2009 (UTC)
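A minimal Python sketch of the retry behaviour requested above: pause five minutes after a gateway timeout, try again, and give up after three failed attempts. This is an illustration only, not AWB code. The `GatewayTimeout` exception, the `request` callable and the injectable `sleep` parameter are all hypothetical names introduced for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class GatewayTimeout(Exception):
    """Hypothetical exception standing in for an HTTP 504 response."""


def call_with_retry(request: Callable[[], T],
                    max_attempts: int = 3,
                    pause_seconds: float = 300,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run request(), pausing and retrying when it times out.

    Re-raises GatewayTimeout once max_attempts calls have all failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except GatewayTimeout:
            if attempt == max_attempts:
                raise  # give up after the final failed attempt
            sleep(pause_seconds)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter is a design choice that lets the pause be replaced in tests, so the retry logic can be exercised without actually waiting five minutes.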
There are wiki guidelines about the position of citations and ref within text with particular detail with punctuation.[2] It is quite common to see the wrong sort of formatting: [3][4] often there is a space between the punctuation (usually a full stop, comma, colon, or semicolon, [5] but could be after "quotation marks"[6], a question mark, or round brackets) [7], and the full stop or comma is put after the reference [8]. Sometimes there is a comma or full stop before and after reference. [9]. Sometimes there are too many spaces both before and after the reference, [10] or no spaces.[11]Sometimes, they are in the middle of the line when it is difficult[12] to known where they should go, if there is a lot of punctuation on that line. I expect that there are some other common errors too. I have just worked through the page on "Alexander Graham Bell" [13]; I corrected dozens of these mistakes manually, which were not fixed by AWB.[14]
I have not worked on it, but it sounds easy, I have been told by a programmer - try using the octal forms of brackets and backslashes in the reg ex. Probably need to first recognise if the format is correct or not, and then only put the wrong ones through a subroutine to save doing too many loops. Snowman13:32, 30 September 2007 (UTC)
It should be fairly easy (just needs someone with the time to sit down and play with it) - Set of regex's to match the bad ones, then something to find the nearest/next full stop, and then just move the reference to there.. Reedy Boy17:37, 30 September 2007 (UTC)
As I have suggested the refs in the middle of text and not adjacent to punctuation would be difficult to reposition because the punctuation might need sorting out, and it might not be satisfactory to move them to the next punctuation, where the refs might look like they are referring to the wrong facts. At the present time I was thinking that these would be left where they were in the middle of the sentence. It is where the spacing is wrong adjacent to punctuation that could be quite easily fixed with reg ex. Spaces could be swapped out/in and/or punctuation moved. The case of more than one ref at a punctuation also needs to be considered. It can be tested in the above block of text although all variations are not included. Snowman18:15, 30 September 2007 (UTC)
Yes, that would be helpful; but not just for full stops but for all punctuation, brackets, and quotation possibilities as well, and refs where the punctuation is included before the end of italics and bold text. Perhaps start with punctuation marks and obvious ref positions, to get it launched with a success. I think that the diff screen needs to show changes in blank spaces more clearly to show what has been done - that is another suggestion. Have you seen the diff display in Winmerge software, also on sourceforge? Snowman18:45, 30 September 2007 (UTC)
That would save us having to come up with them ourselves for the AWB project. And would mean they could be added fairly easily to AWB for the next release. If you wouldn't mind, we'll certainly use them. And give you credit in the code ;) —ReedyBoy20:27, 14 November 2007 (UTC)
Here are my rules, they must be applied in the order given:
Rule: Move reference to after punctuation (1)
Replace: (<ref>|<ref )([^<]*)(</ref>|/>)([\.,;:"])
With: $4$1$2$3
Rule: Delete white-space before reference (1)
Replace: \s(<ref>|<ref )([^<]*)(</ref>|/>)
With: $1$2$3
Apply: Twice
Rule: Delete white-space between references (1)
Replace: (<ref>|<ref )([^<]*)(</ref>|/>)\s(<ref>|</ref )([^<]*)(</ref>|/>)
With: $1$2$3$4$5$6
Rule: Delete white-space before punctuation followed by reference (1)
Replace: \s([\.,;:"])(<ref>|<ref )([^<]*)(</ref>|/>)
With: $1$2$3$4
Rule: Delete white-space before punctuation followed by reference (1)
Replace: \s([\.,;:"])(<ref>|<ref )([^<]*)(</ref>|/>)
With: $1$2$3$4
Rule: Move reference to after punctuation (1)
Replace: (<ref>|<ref )([^<]*)(</ref>|/>)([\.,;:"])
With: $4<!--delspacex-->$1$2$3
Rule: Delete white-space before reference (1)
Replace: \s(<ref>|<ref )([^<]*)(</ref>|/>)
With: $1$2$3
Apply: Twice
Rule: Delete white-space between references (1)
Replace: (<ref>|<ref )([^<]*)(</ref>|/>)\s(<ref>|</ref )([^<]*)(</ref>|/>)
With: $1$2$3$4$5$6
Rule: Delete white-space before punctuation followed by reference (1)
Replace: \s([\.,;:"])(<ref>|<ref )([^<]*)(</ref>|/>)
With: $1$2$3$4
Rule: Add space after reference followed by text.
Replace: (</ref>|[^b][^r]\s/>)([A-Za-z0-9])
With: $1<!--insspace1--> $2
I find this set to be reasonably reliable and effective, would like to hear how others get on.
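For anyone wanting to try the first set of rules outside AWB, here is a minimal Python sketch of applying them in the order given. It is only an illustration, not AWB's implementation: the `$1`-style replacements are rewritten as Python `\1` backreferences, the names `REF`, `RULES` and `apply_rules` are made up for this example, and the second opening-tag alternative in the "between references" rule is assumed to be `<ref `.

```python
import re

# One reference: opening tag, body, closing tag (three groups, as in the rules above).
REF = r'(<ref>|<ref )([^<]*)(</ref>|/>)'

# (pattern, replacement, number of passes), applied strictly in this order.
RULES = [
    # Move reference to after punctuation
    (REF + r'([\.,;:"])', r'\4\1\2\3', 1),
    # Delete white-space before reference ("Apply: Twice")
    (r'\s' + REF, r'\1\2\3', 2),
    # Delete white-space between references
    # (assumption: both references open with "<ref>" or "<ref ")
    (REF + r'\s' + REF, r'\1\2\3\4\5\6', 1),
    # Delete white-space before punctuation followed by reference
    (r'\s([\.,;:"])' + REF, r'\1\2\3\4', 1),
]


def apply_rules(text):
    """Run every rule over the text, in order, for its number of passes."""
    for pattern, replacement, passes in RULES:
        for _ in range(passes):
            text = re.sub(pattern, replacement, text)
    return text
```

The order matters in this sketch: a reference written as "1859 <ref>...</ref>." only loses its leading space in the last rule, after the first rule has moved the full stop in front of it.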
The white space diff is reported to be ready in the next version, which will help to show what the above (or similar) has done in the AWB diff screen. Snowman (talk) 00:46, 30 November 2007 (UTC)
Only the first rule seems to work for me. Also, when using only the first rule, it would be useful to let it run repeatedly (when several references are present, to move the punctuation to just in front of the very first one in the whole row); however, when I set it to repeat, it did not seem to take effect. I also do not understand why you duplicate some rules and why you say "apply twice" on some: isn't it the same?--Kozuch (talk) 20:39, 17 May 2008 (UTC)
Common issues of this set are:
double punctuation not removed (example: .<ref>abc</ref>. like here)
works within comments too - unwanted actually (there are also often hints how to use the ref tags in the References sections)--Kozuch (talk) 09:16, 22 June 2008 (UTC)
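One possible way around the second issue is to run each rule only on the text outside HTML comments. The sketch below is an illustration under that assumption; `sub_outside_comments` is a hypothetical helper, not something AWB provides, and a match that would straddle a comment boundary is simply not seen.

```python
import re

# Non-greedy so that each <!-- ... --> is matched separately; DOTALL for multi-line comments.
COMMENT = re.compile(r'<!--.*?-->', re.DOTALL)


def sub_outside_comments(pattern, replacement, text):
    """Apply re.sub to the text between HTML comments, leaving comments verbatim."""
    parts = []
    last = 0
    for match in COMMENT.finditer(text):
        # Text before this comment gets the substitution ...
        parts.append(re.sub(pattern, replacement, text[last:match.start()]))
        # ... and the comment itself is copied through untouched.
        parts.append(match.group(0))
        last = match.end()
    # Remaining text after the last comment.
    parts.append(re.sub(pattern, replacement, text[last:]))
    return ''.join(parts)
```

With the "delete white-space before reference" pattern, this fixes the spacing in the visible text while leaving a usage hint such as `<!-- use <ref>source</ref> here -->` exactly as written.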
I implemented the above routines, but as Kozuch pointed out there are problems with them and they mess around with the white space too much. So I've recreated them with a focus on only moving things around when needed and setting a fixed amount of whitespace after a reference.
for i in range(0, 10):
    # Move punctuation left and if any space move right
    new_text = re.sub(r'(?<=[\w")>])( *)(<ref [^<>]*/> *|<ref[^</>]+>.*?</ref> *)([\.,;:"])', r'\3\2\1', new_text)
    # Move space to the right,
    new_text = re.sub(r'(?<=[\.,;:>])( +| *\n)(<ref [^<>]*/>|<ref[^</>]+>.*?</ref>)(?= *\S)', r'\2\1', new_text)
# Add two space if none, reduce to two if more
new_text = re.sub(r'(</ref>|<ref [^<]*/>)( {3,}|)(\w)', r'\1 \3', new_text)
Looks interesting. Max and myself are planning on getting the next version out this weekend, so will look at this afterwards! =) —Reedy23:36, 12 September 2008 (UTC)
Basically the loop is used to move the space and punctuation one by one to each side of multiple refs. I've updated the example to what's being used on the server. Now there's a bug: since it only moves spaces when there's text on the line to the right, it will leave spaces before the last ref, e.g. "end of text.[3][4] [5]". So it is probably a good idea to run "Delete white-space between references" after doing this. There might be another bug if there's a ref starting on each line... — Dispenser14:21, 15 September 2008 (UTC)
Yes, that's correct. However, I've been running it for some time and got the following corner cases. I'll try to remove the loop hack. Below are some issues I found while using the code. — Dispenser03:24, 23 November 2008 (UTC)
Done. The following code should work well enough, and can serve as a test case for AWB's general fixes. I think the matching of named groups is not implemented in C#. — Dispenser18:35, 23 November 2008 (UTC)
# Only apply if punctuation in front is the dominate format
if len(re.findall(r'[.,;:][ ]*\s?<ref', text)) > len(re.findall(r'(?:</ref>|<ref [^>]+/>)[ ]*\s?[.,;:]', text)):
    # Move punctuation left and space right but before \n
    text = re.sub(r'(?s)(?<=[\w")\]])([ ]*)(\s?(?:<ref [^>]+?/>[ ]*\s??|<ref[^>]*?>[^>]*?</ref>[ ]*\s??)+)(\n?)([.,;:])(?![.,;:])(\s??)( *)', r'\4\2\1\6\5\3', text)
    # Move space to the right, if there's text to the right
    text = re.sub(r'(?s)(?<=[.,;:"])([ ]+)((?:<ref [^>]+?/>[ ]*\s??|<ref[^>]*?>[^>]*?</ref> *\s??)+)(?=[\w\(\[])', r'\2\1', text)
    # Remove duplicate punctuation
    text = re.sub(r'(?s)(?P<punc>[.,;:])(["]?(?:<ref [^>]+?/>[ ]*\s?|<ref[^>]*?>[^>]*?</ref>[ ]*\s?)+)(?P=punc)(?![.,;:])', r'\1\2', text)
    # Remove spaces between references
    text = re.sub(r'(</ref>|<ref [^>]+?/>)[ ]+(<ref)', r'\1\2', text)
    # Add two space if none, reduce to two if more
    # trim or add white space after <ref />
    text = re.sub(r'(</ref>|<ref [^>]+?/>)()((\'{2,5}|)[\w"(\[])', r'\1 \3', text)
    text = re.sub(r'(</ref>|<ref [^>]+?/>)([ ]{3,})([\w(\[])', r'\1 \3', text)
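The guard at the top of that code can be looked at on its own. A small sketch, using the same two patterns, of how the "dominant format" test might be wrapped in a function; the name `punctuation_first_dominates` is invented here for illustration.

```python
import re


def punctuation_first_dominates(text):
    """True if 'punctuation then <ref' occurs more often than 'ref then punctuation'."""
    # Punctuation directly (or after spaces) in front of an opening ref tag.
    before = len(re.findall(r'[.,;:][ ]*\s?<ref', text))
    # Punctuation directly (or after spaces) behind a closing or self-closed ref tag.
    after = len(re.findall(r'(?:</ref>|<ref [^>]+/>)[ ]*\s?[.,;:]', text))
    return before > after
```

The point of such a check is that an article which consistently puts references before punctuation is left alone, and only the minority cases in a mostly punctuation-first article are moved.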
Is this implemented? And if so how do I activate it? I've been trying to add them as Find/Replace rules, but have had little success thus far (though tis my first day). - RoyBoy21:07, 31 May 2009 (UTC)
I resorted to adding the individual regexps in Find/Replace (normal), but there has to be an easier way to have the multiple expressions together in one Rule, instead of individual lines. I also added the regexp: ([\.,;:"])([\.,;:"])(<ref>|<ref ), replace with: $2$3, which removes repeated punctuation. - RoyBoy21:45, 31 May 2009 (UTC)
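To see what that last expression does and does not catch, here is a short Python rendering of it (with `$2$3` written as `\2\3`); `drop_repeated_punctuation` is just a name chosen for this sketch.

```python
import re

# Two punctuation characters in a row, immediately followed by an opening ref tag.
DOUBLE_PUNCT = re.compile(r'([\.,;:"])([\.,;:"])(<ref>|<ref )')


def drop_repeated_punctuation(text):
    """Keep only the second of two adjacent punctuation marks before a reference."""
    return DOUBLE_PUNCT.sub(r'\2\3', text)
```

Two things are worth noting. Because the double quote is in the character class, a sentence ending `."` before a reference loses its full stop, which may not be wanted. And punctuation on both sides of a reference (`.<ref>abc</ref>.`) is not touched by this pattern.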
I've reinstalled recently, and it keeps ignoring Spaceship_Earth_(Epcot). I have general fixes selected, but it does not pick up the ref spacing issues. What do I have to change? - RoyBoy05:21, 25 December 2009 (UTC) Using build 5739.
Says "Only general fix changes". I notice it doesn't do punctuation changes, so I would still have to use the regex anyway for that? - RoyBoy16:53, 28 December 2009 (UTC)
Rereading the initial request, I think we did all that could possibly be done. The rest could not be part of genfixes since WP:REFPUNCT allows different punctuation styles. I'll archive this soon if there are no disagreements. -- Magioladitis (talk) 00:38, 9 January 2010 (UTC)
Not implemented
Fact → citation needed only when other changes are made
This edit was pretty pointless. If AWB would fix template redirects on an article but make no other changes, might I suggest that it just skip the page? That way there won't be any useless edits like this. At least an option for this would be nice. Thanks! –Drilnoth (T • C • L) 02:01, 8 September 2009 (UTC)
I kinda agree; on the other hand, the change in clean-up template name in this instance was done to clarify meaning, not for the usual readability or combining reasons. RichFarmbrough, 15:19, 12 September 2009 (UTC).
This isn't really a bug per se, so I added it here, but AWB does not seem to work in the beta version (and potentially future versions) of WP.--Kumioko (talk) 14:44, 7 August 2009 (UTC)
Beta and future?? It's working fine on current svn.. And newer releases are going to use the api, which is less susceptible to stupid breaking changes. What version are you having problems with?? —Reedy15:01, 7 August 2009 (UTC)
AWB consolidates duplicate references and replaces the subsequent occurrences with named references. This is not necessarily easy to spot in large diffs [4].
It might be helpful if this was automatically mentioned in the edit summary (similar to the note added for typos). -- User:Docu 13:03, 21 April 2009 (UTC)
It is my understanding (though only recently reached) that AWB will only do this with existing named references (i.e. it never makes up the names for refs, only propagates them). But yeah, it would be nice to see. - Jarry1250(t, c)17:44, 21 April 2009 (UTC)
Unfortunately the reason for some of the AWB general fixes may not be self-evident, but the edit summary is very short, so I don't think it's appropriate to start adding such things to the edit summary. If we added this there would be many others, and we'd soon run out of space. Also, what about the summary on non-English wikis?
Jarry1250's understanding is correct. I will update the information on the general fixes in the user guide when the named reference tidying is in the official release. Rjwilmsi17:56, 21 April 2009 (UTC)
I understand. BTW I thought I had seen references being named, but I might be mistaken. -- User:Docu
I have a custom module that gives references names (sometimes). Part of it might get into a future release of AWB, but not yet. Rjwilmsi18:20, 21 April 2009 (UTC)
Separate ref tag details from text, put inside references tag
Code went in just today to do this. This alone will vastly improve the editability of reference-heavy wikitext. Could AWB please do this as part of general fixes? (though it might want to avoid it in cases of {{reflist}}.) - David Gerard (talk) 23:52, 17 September 2009 (UTC)
Is there consensus on this? I don't think that there have been any major discussions recently, just one a few months ago, and unless there is overwhelming support it probably shouldn't be a genfix. (also, how it is done could vary from article to article depending on if it uses Harvard refs, full citations, or something else altogether). –Drilnoth (T • C • L) 23:56, 17 September 2009 (UTC)
Separating refs from body text is a long-requested feature - wikitext is presently almost unreadable and uneditable. I'm sure someone will consider editable wikitext a flaw, but I'd hope they were vastly outnumbered. As for referencing style, shirley leaving what's inside the <ref></ref>tags unaltered would be sensible - David Gerard (talk) 00:10, 18 September 2009 (UTC)
God no. It's a lot easier to do The line opened in 1859.<ref>Blow, p. 45</ref> than The line opened in 1859.<ref name="Blow, p. 45"/> and then put the reference separately at the end. It's impossible for an automated process to determine whether moving it to the end is the right thing to do. (I could see consensus for moving already-named refs to the end, though.) --NE211:15, 18 September 2009 (UTC)
I also disagree with the proposal to have AWB force this recent cite.php feature change indiscriminately/automatically. That feature change may have been designed to address an editability issue that occurs in only one of wikipedia's referencing systems (namely, "[long] footnotes") but it does not really make any sense to apply it to the other systems. In bibliography-based referencing systems, like WP:CITESHORT and WP:PAREN/WP:HARV, the articles already have a separate list that contains the full references/sources, ie the bibliography. For citeshort all that's in-between the <ref></ref> tags is something like "Smith 2001: 123", or sometimes a discursive note. Replacing these with named tags and then listing them all out at <references/> doesn't simplify or help with anything. Almost the reverse, since this new cite.php change more or less tries to emulate what citeshort and other biblio-style systems do, only in a more complicated and randomly ordered way.--cjllwʘTALK03:41, 21 September 2009 (UTC)
There appear to be a couple of AWB bots now trying to remove {{lifetime}} from articles. Was there a consensus somewhere I missed that this was desirable? (If so the template should probably be marked as deprecated.) In any case, AWB appears to not be checking the input very rigorously: [5][6][7][8][9]. This is mainly GIGO, but I would expect AWB to do better input checking, such as correct number of parameters, year actually containing digits, etc. AWB also appears to be using the second defined parameter for the death year instead of just using the second parameter. --Pascal66601:37, 26 September 2009 (UTC)
There's no logic in AWB to remove lifetime. The bots are using their own logic, so better to contact them over these errors that they have not allowed for. Rjwilmsi06:24, 26 September 2009 (UTC)
It is a slight problem, the code can be tweaked easily enough to deal with problem X but there will always be problem Y. I am satisfied that Category:Births or Category:Living deaths (the latter can no longer happen but it made me laugh) etc. are better explicit than implicit, as any editor, even a newbie can immediately see where they are and that they are wrong. RichFarmbrough, 22:48, 29 September 2009 (UTC).
I am needing to null edit a whole range of pages at enWS to reset/recover a flag, and while I can do it in manual null edit, I cannot get the bot function to run, it skips with nothing to do. I don't have the skills to write a plug-in or a module, and don't feel ready to individually save 4000-5000 pages. Thanks for help or suggestions. billinghurst (talk) 15:10, 14 November 2009 (UTC)
The thing is that I am not trying to get around queued jobs. These pages will sit at enWS like this until they are resaved, and at WS our featured/completed works may never be resaved, they are not dynamic articles. For works in progress to have to go back and manually null edit pages and so many pages is all make work and has zero benefit, and makes no difference on the load that the servers will have to undertake beyond having 4000+ edits over the space of a week or two, rather than over 4000 minutes. I can provide further detail and examples and why. The only option that I now have is to perform faux edits on these pages to undertake some minor situational change to make it force an edit and therefore a save. Gee, it would be tons easier to get it to run through twice (once to add, once to remove) to avoid having save manually. That seems like more work on the servers than less for what is a try bot job. billinghurst (talk) 22:34, 14 November 2009 (UTC)
Currently, to use the database scanner, one has to decompress the downloaded database dump first. I suggest that the database scanner could alternatively be able to read the compressed file directly, because I think that the ca. 20 GB could be too much for some users. Of course this would increase the time needed for the scan (I don't know how much), but the user could always decompress the dump himself for better performance. (Note: I personally wouldn't use this feature, but I think it could be useful for others.) Svick (talk) 00:03, 27 November 2009 (UTC)
IMHO, if someone can't afford 20GB in today's market... The PC they are using is most likely going to struggle on memory and/or CPU time to do this... —Reedy01:00, 27 November 2009 (UTC)
Unless you have an SSD. But anyway, performance of this would be terrible. Nice idea, but I don't think it's feasible. Better would be a "submit your query to server for execution" service. Rjwilmsi08:53, 27 November 2009 (UTC)
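For reference, reading a compressed dump as a stream is straightforward in principle; the open question in this thread is speed, which the sketch below does not measure. It assumes the dump is bz2-compressed and uses Python's standard `bz2` module; `iter_dump_lines` and `count_titles` are illustrative names, not part of AWB's database scanner.

```python
import bz2


def iter_dump_lines(source):
    """Yield text lines from a bz2-compressed dump without unpacking it to disk.

    source may be a file name or an already opened binary file object.
    """
    with bz2.open(source, mode='rt', encoding='utf-8') as handle:
        for line in handle:
            yield line.rstrip('\n')


def count_titles(source):
    """Count <title> lines as a stand-in for a real scan over the pages."""
    return sum(1 for line in iter_dump_lines(source) if '<title>' in line)
```

Only one decompressed chunk is held in memory at a time, so disk space is saved at the cost of decompressing again on every scan.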
No. I am not sure at all. There may be hundreds or very few such cases. But what's the harm in it? (AWB removes repeated words, so repeated sections can also be removed.) --Siddhant (talk) 08:11, 9 December 2009 (UTC)
I did an incomplete scan of the database dump and found 121 such pages in the main namespace, 160 in File and 13 in Wikipedia namespaces. All instances in the File namespace I looked at are really duplicates. In articlespace, many of them seem to be a level 2 heading directly followed by a level 3 heading with the same text. While these are not plain wrong, I think they should be fixed too. At least in one case (Glossary of chess#B) this is correct and should stay that way. (I didn't finish the scan because of a bug in my code, I'll fix it and scan the whole dump over the weekend.) Svick (talk) 01:46, 11 December 2009 (UTC)
After I finished the whole scan, I found 281 such pages in the main namespace, 4654 in the File namespace and 70 in others, so I guess it would make sense to implement this feature. Svick (talk) 15:05, 12 December 2009 (UTC)
Are these double sections one directly after the other? I think WP:CHECKWIKI has to start cleaning them for a while to see if there are any problems, and after that ask a bot to do it. Without supervision we can't be sure that there is no reason for these duplications. -- Magioladitis (talk) 22:25, 12 December 2009 (UTC)
Yes, directly one after the other with no text between them, but I counted them even when they weren't the same level. Svick (talk) 22:41, 12 December 2009 (UTC)
Scratchy & Co. reveals a bug in my code – I didn't disallow the newline character in section names, so the four equal signs, each on a new line, in {{Infobox Television}} on that page made it think that there are two level 1 headings, both named newline. Svick (talk) 11:51, 13 December 2009 (UTC)
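A rough Python sketch of the kind of scan described in this thread: find a heading that is directly followed by another heading with the same text (levels may differ), with headings confined to a single line so that lone equal signs on separate lines are not read as headings. This is not Svick's actual code; `HEADING` and `adjacent_duplicate_headings` are names made up for the example.

```python
import re

# A heading sits on one line: the same run of '=' on both sides of the title.
# '.' does not match a newline here, so a title can never span lines.
HEADING = re.compile(r'^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$', re.MULTILINE)


def adjacent_duplicate_headings(wikitext):
    """Return titles of headings directly followed by a heading with the same text."""
    duplicates = []
    previous = None  # (title, end offset) of the last heading seen
    for match in HEADING.finditer(wikitext):
        title = match.group(2)
        if previous is not None:
            prev_title, prev_end = previous
            # Only whitespace may separate the two headings.
            between = wikitext[prev_end:match.start()]
            if title == prev_title and between.strip() == '':
                duplicates.append(title)
        previous = (title, match.end())
    return duplicates
```

As the Glossary of chess example shows, a hit from a scan like this still needs a human to decide whether the duplication is intentional.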
{{sic}} requires the incorrect text to be in the template, but not a whole sentence. Do you have some examples of how the logic should work? Thanks Rjwilmsi11:09, 9 April 2009 (UTC)
Currently there is no exception list in AWB. This means that it essentially applies every edit criterion every time for every article. I have frequently run across articles that meet the criteria for the change but should not be changed. An example of this is when I refine the rank links to be more specific (Captain (United States) vice Captain) for certain military personnel in the United States. Some personnel were also military personnel in other countries such as Egypt, England, Mexico and others, and it would be greatly beneficial if there were an exception list to which I could add this type of false positive.--Kumioko (talk) 18:01, 8 May 2009 (UTC)
There is a false positive button that can be enabled. Though, i can't tell you what/how it does or even if it works... —Reedy11:47, 9 May 2009 (UTC)
I mark this one as "exists". Enable "False positive button", while reviewing an article add it to the list if it's a false positive. I have false positives lists which I use to exclude articles from automatically created lists by using the Filter. -- Magioladitis (talk) 17:54, 25 December 2009 (UTC)
Currently, when logging is enabled, the logs that are generated are created within the AWB folder, and after running for a few days they make the folder very messy. I recommend adding logic to AWB to add logs to a logs folder when logging is enabled. This is also in keeping with standard programming practices.--Kumioko (talk) 15:18, 18 December 2009 (UTC)
I would like to request some new logic be added to AWB to eliminate duplicate stubs if AWB sees them in an article. I have found numerous articles with the same stub tag multiple times. I usually just delete them, but for the sake of emphasizing this need I have left one intact. See James Lewis (United States Army) as an example of what I am talking about. Now I admit that this article is probably a poor example since it meets notability guidelines anyway and will probably be deleted at some point in the future. But it shows the problem. Since this is already done for categories it seems reasonable that we should also be able to do it for stubs.--Kumioko (talk) 15:01, 2 January 2010 (UTC)
I found 206 such articles in the main namespace in the last database dump. Some of them are false positives (such as William Graham Sumner, which twice contains a redlink with the comment <!-- add tag to new article when it is created {{US-sociologist-stub}} -->) and some of them have been fixed since the dump was created. I considered the stub templates to be the same only when they were spelled exactly the same. The list is at [12]. Svick (talk) 10:51, 3 January 2010 (UTC)
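A sketch of how such a check could avoid the false positive mentioned above, by ignoring stub tags that only appear inside HTML comments. It follows the "spelled exactly the same" comparison; the pattern for what counts as a stub template (any parameterless template whose name ends in "stub") and the name `duplicate_stubs` are assumptions made for this example.

```python
import re
from collections import Counter

COMMENT = re.compile(r'<!--.*?-->', re.DOTALL)
# Assumption: a stub tag is a template without parameters whose name ends in "stub".
STUB = re.compile(r'\{\{([^{}|\n]*stub)\}\}')


def duplicate_stubs(wikitext):
    """Return stub template names that occur more than once outside comments."""
    visible = COMMENT.sub('', wikitext)       # drop commented-out tags first
    counts = Counter(STUB.findall(visible))   # exact-spelling comparison
    return sorted(name for name, n in counts.items() if n > 1)
```

Differently capitalised or redirected names are treated as different templates here, so this undercounts rather than overcounts.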
Svick, you are great. :) Maybe you fill in a request to CHECKWIKI as well? After that, I agree we have to implement this for AWB. --Magioladitis (talk) 11:14, 3 January 2010 (UTC)
This already exists in AWB. Why did nobody actually check? Unit tests to come... Rjwilmsi
I don't have access to AWB right now, but I think that Kumioko undid one part of the changes by accident by double-clicking in the diff window. So what AWB originally tried to do was "move stub to the bottom". --Magioladitis (talk) 20:38, 3 January 2010 (UTC)