Bug

Pet Bugs? 1261

Posted by Cliff
from the bugs-even-industrial-strength-RAID-doesn't-kill dept.
benreece asks: "During my few years as a programmer/developer I've come across some strange bugs. Recently I found that Microsoft's VB/VBScript(ASP) round function has problems (for example, 'round(82.845)' returns '82.84' instead of '82.85'). It took me an annoyingly long time to realize the problem wasn't mine. I'm wondering what other obscure, weird, and especially annoying bugs in languages/compilers/etc have frustrated other developers." Memorable bugs. Every developer has one. What were yours?
This discussion has been archived. No new comments can be posted.

  • Re:Pet Bugs? Oh Yes! (Score:4, Interesting)

    by frovingslosh (582462) on Wednesday June 26, 2002 @03:02PM (#3771774)
    The 6502 micro actually has my "favorite" pet bug. Since it was an 8-bit micro with 16-bit addressing, any indirect address stored in memory took two bytes. An undocumented (by MOS/Commodore) bug was that if the address spanned a 256-byte boundary, the first byte of the address would be fetched properly, but the next byte would come from a memory location 256 bytes prior to the location where it should have (internally, the carry for the address counter didn't flow from the lower byte to the upper byte). I would love to know how many man-centuries were spent on this one; I know that some fellow developers and I contributed to some of that lost time.
  • OpenGL Matrix Stack (Score:2, Interesting)

    by Bigmell (129905) on Wednesday June 26, 2002 @03:03PM (#3771782)
    I dunno if this counts, but I had a hell of a time (an entire semester!) figuring out why one of my animations worked properly and then, after a while (when idling), everything just shifted to the center (0,0,0).

    Basically I found that at the end of your display function you have to make sure you pop all your matrices off the stack, otherwise the stack fills and everything shifts to the center.

    The bug part is that OpenGL gives you no error! It just allows you to push to full stacks and pop off empty stacks all day with no error and no way to view the stack or set the stack back to zero.

    Our last project (for class) had to do with lighting so I included in the readme "After 15 seconds everything translates to the center. But the lights still work!" :)
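For what it's worth, the fixed-function OpenGL spec does record an error here: glPushMatrix on a full stack sets GL_STACK_OVERFLOW and glPopMatrix on a one-entry stack sets GL_STACK_UNDERFLOW, but you only find out if you call glGetError. The sketch below is not OpenGL. It is a toy fixed-depth stack in plain C (every name and the depth of 32 are assumptions for illustration) that models the "refuse quietly, set a flag" behavior and shows how one missing pop per frame eventually runs into the limit.

```c
#include <stddef.h>

#define TOY_STACK_DEPTH 32  /* assumed limit; real limits are implementation-defined */

enum toy_error { TOY_NO_ERROR = 0, TOY_STACK_OVERFLOW, TOY_STACK_UNDERFLOW };

struct toy_stack {
    size_t depth;               /* entries in use; starts at 1, like a matrix stack */
    enum toy_error last_error;  /* first unread error; cleared when read */
};

void toy_init(struct toy_stack *s)
{
    s->depth = 1;
    s->last_error = TOY_NO_ERROR;
}

/* Push: on a full stack, record an error and change nothing. */
void toy_push(struct toy_stack *s)
{
    if (s->depth == TOY_STACK_DEPTH) {
        if (s->last_error == TOY_NO_ERROR)
            s->last_error = TOY_STACK_OVERFLOW;
        return;
    }
    s->depth++;
}

/* Pop: with only the base entry left, record an error and change nothing. */
void toy_pop(struct toy_stack *s)
{
    if (s->depth == 1) {
        if (s->last_error == TOY_NO_ERROR)
            s->last_error = TOY_STACK_UNDERFLOW;
        return;
    }
    s->depth--;
}

/* Return the recorded error and reset the flag. */
enum toy_error toy_get_error(struct toy_stack *s)
{
    enum toy_error e = s->last_error;
    s->last_error = TOY_NO_ERROR;
    return e;
}

/* One "display function" call that pushes twice but pops only once. */
void toy_leaky_frame(struct toy_stack *s)
{
    toy_push(s);
    toy_push(s);
    toy_pop(s);
}
```

Calling toy_leaky_frame in a loop grows the depth by one per frame until the limit is reached; after that the second push is dropped every frame, and toy_get_error is the only place the problem is visible.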
  • by TibbonZero (571809) <Tibbon@ g m a i l . com> on Wednesday June 26, 2002 @03:03PM (#3771783) Homepage Journal
    You can get a TI-8x to report a negative Kelvin temperature by converting a negative Celsius value to Kelvin. According to TI, it comes down to where you put the negative symbol; however, none of the other temperature converters screw up with negative numbers.

  • by SuperMallen (156287) on Wednesday June 26, 2002 @03:04PM (#3771799) Homepage Journal
    I worked a wicked long time ago on the HotJava browser, and we were forever running into strange behaviors in the ways IE and Netscape handled what looked like normal HTML tags.

    My favorite was a bug we saw with a three column table. The table's three cells were specified like this:

    <td width="31"></td><td width="42"></td><td width="29"></td>

    Being a good little HTML-compliant browser, HotJava displayed them with those pixel widths. But lo and behold! When displayed in Netscape, the table filled the screen.

    We bashed our heads against the wall to figure this out until we realized that the numbers added up to almost, but not quite, 100. Netscape was treating them as percentages rather than pixel widths, even though they lacked percent signs. The cutoff turned out to be somewhere around 104 to 96. Anywhere in there and the browser would assume percentages.
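Seen from the outside, the heuristic described above might look something like this. It is a guess at the observed behavior, not Netscape's actual code: the function name is made up, and the 96 to 104 window is just the "somewhere around" estimate from the post.

```c
#include <stdbool.h>
#include <stddef.h>

/* Guess whether unitless column widths were "meant" as percentages:
 * true when they add up to roughly 100 (assumed window: 96..104). */
bool widths_look_like_percentages(const int *widths, size_t count)
{
    long sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += widths[i];
    return sum >= 96 && sum <= 104;
}
```

The three widths from the table (31, 42 and 29) add up to 102, which lands inside the window, so they would be stretched to fill the screen.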
  • rounding bugs (Score:2, Interesting)

    by RussRoss (74155) on Wednesday June 26, 2002 @03:05PM (#3771803) Homepage
    My favorite rounding bug was the floating point bug in Applesoft BASIC on the Apple II series. A loop that adds .1 repeatedly to a variable would expose it pretty quickly. It was a basic part of life for years on that machine. To this day I am still very careful about comparing floats using an EPSILON value, i.e., the equivalent of abs(x - y) < EPSILON rather than x == y.

    The Pentium bug was definitely a big one, but the Applesoft bug had more of an effect on me personally.

    I feel like a gray-haired old man showing my age and I'm only 25. Weird industry we're in..

    - Russ
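A minimal C sketch of the EPSILON comparison mentioned above. The tolerance value and the function names are assumptions for illustration; a fixed absolute tolerance like this only makes sense when the values are of roughly known magnitude.

```c
#include <math.h>
#include <stdbool.h>

#define EPSILON 1e-9  /* assumed tolerance; choose one that fits your data's scale */

/* Compare two doubles with an absolute tolerance instead of ==. */
bool nearly_equal(double x, double y)
{
    return fabs(x - y) < EPSILON;
}

/* Add 0.1 ten times, the way the BASIC loop would. */
double sum_tenths(void)
{
    double total = 0.0;
    for (int i = 0; i < 10; i++)
        total += 0.1;
    return total;
}
```

On typical IEEE 754 double hardware sum_tenths() tends to come out a hair below 1.0, so sum_tenths() == 1.0 can be false while nearly_equal(sum_tenths(), 1.0) is true.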
  • 1 != 1 (precision) (Score:2, Interesting)

    by cacav (567890) on Wednesday June 26, 2002 @03:06PM (#3771825)
    My latest encounter with this bug was with Skill (a Scheme/Lisp derivative used in the Cadence VLSI design toolset). I've seen it in other languages as well.

    In my latest encounter, I'd do a bunch of calculations in a design automation program for us at work to use on a chip design, and I couldn't figure out why my numerical tests kept failing. I'd have some variable X, and do a bunch of functions on it like multiplication and division, then test to see if it was equal to 1. But none met the condition. When I printed out a bunch of the variables as floats, I saw that the test 1==1 was failing. I was confused, to say the least. At first we suspected it was something like miscomparing a float to an int, but since Skill doesn't have floats or ints like that, it wasn't the problem.

    Turns out that it was a precision issue. It was really testing to see if 1==1.0000000000001 or something like that, because the * and / functions stored very precise values in memory, more than I'd care about. Ever since then we've always had to do something like atof(sprintf("%.3f",X)) to get the value without the extra precision. Stupid and very annoying.
  • by kawika (87069) on Wednesday June 26, 2002 @03:07PM (#3771834)
    When I go to the front page I see one set of topics. If I go to "older stuff" I see a few topics there that seem like they should be on the front page, but aren't. I haven't checked any boxes under the preferences page's "Exclude stories from home page" so I would think they would all show up.

    I know this must be happening for most Slashdot readers because the topics I don't see have maybe a dozen posts after a day. So is it happening to you too?
  • by Anonymous Coward on Wednesday June 26, 2002 @03:08PM (#3771844)
    According to the Jargon Files, this was a CPU instruction that would cause a hardware error. The example [cam.ac.uk] given was rapidly toggling some bus lines so as to cause them to catch fire :-]
  • by march (215947) on Wednesday June 26, 2002 @03:09PM (#3771857) Homepage


    As I reported in RISKS in 1997:

    DEC Alpha Bug?!?

    Wed, 02 Jul 1997 15:14:24 -0400

    So there I am, looking at our trading system and noticing that the price of one particular bond was different on two separate machines. Damn, I think. Must be a bug in the latest release of our software. Quick, do a sum on all the libraries. Nope, they are the same. Executable? Nope, the same.

    Hmm... Step through the code, hey, look at that! The pow() function is returning different results!

    So, I wrote a stand-alone program. Sure enough, the machine with the latest rev motherboard (one that was just replaced by DEC) is producing bad numbers. Time to try 'dxcalc', DEC's X calculator. Yup, different numbers. How about perl? Yup, different numbers. How about 'bc'? Duh, bc doesn't take floating point powers. Hmm... check libm. Nope, they are the same.

    Bottom line: DEC will be here shortly.

    Test your alpha. Try 'pow(1.234567, 7.654321)'. If you don't get 5.017something, you have the same problem.

    RISKS? In our case, could have been a large sum of money.


    The final resolution was that DEC claimed to have a bad motherboard. Yeah, right....

  • by Kris Warkentin (15136) on Wednesday June 26, 2002 @03:10PM (#3771866) Homepage
    I just observed this bug a while ago while porting some software to windows. Do the following:

    fopen some file for writing.
    write some stuff.
    fseek to some offset near the beginning.
    write some more stuff.
    fclose.

    Simple right? Wrong. I observed that the second write didn't get performed unless you explicitly do a fflush before the close. Imagine, not writing dirty buffers out on a fclose....unbelievable.
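For reference, here is the sequence above written out in C with every return value checked, so that a lost write would at least show up as an error instead of vanishing. This is only a sketch: the function name and the sample data are made up, and the explicit fflush is the workaround from the post, which a conforming fclose should not need.

```c
#include <stdio.h>

/* Write, seek back, overwrite, close. Returns 0 on success, -1 on any failure. */
int write_then_patch(const char *path)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    int ok = fputs("hello world", f) != EOF   /* write some stuff */
          && fseek(f, 0L, SEEK_SET) == 0      /* seek to an offset near the beginning */
          && fputs("J", f) != EOF             /* write some more stuff */
          && fflush(f) == 0;                  /* explicit flush (the workaround) */

    /* fclose also flushes; a failure here means data may not have reached disk. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

After a successful call the file should contain "Jello world"; reading it back is the quickest way to see whether the second write survived.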
  • Microsoft Visual C++ (Score:1, Interesting)

    by Anonymous Coward on Wednesday June 26, 2002 @03:11PM (#3771878)
    The code (without line numbers, silly):

    1 void function(void)
    2 {
    3 label:
    4 }

    Gives:

    (3) : error C2143: syntax error : missing ';' before '}'

    If you have a label immediately followed by a closing bracket '}' it won't compile. Which is annoyingly stupid.
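Annoying, yes, but as far as I know the compiler is within its rights: in the C and C++ standards of that era a label has to be attached to a statement, and a closing brace is not a statement. The usual fix is an empty statement after the label. A minimal sketch (the function is made up for illustration):

```c
/* Doubles each item in place, bailing out at the first negative value. */
void double_until_negative(int *items, int n)
{
    for (int i = 0; i < n; i++) {
        if (items[i] < 0)
            goto done;
        items[i] *= 2;
    }
done:
    ;   /* empty statement: gives the label something to attach to */
}
```

Remove the lone semicolon and older compilers reject the function with much the same complaint as above.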
  • by swm (171547) <swmcd@world.std.com> on Wednesday June 26, 2002 @03:11PM (#3771880) Homepage
    These are a few of my favorite bugs [std.com]

    Just to show how cool I am.

  • One of my favorites (Score:4, Interesting)

    by tuxlove (316502) on Wednesday June 26, 2002 @03:12PM (#3771886)
    Way back when, I worked for a company that produced a special version of SysV Unix. One of our developers was going through all the source code and looking for places where global variables were initialized to zero, thusly:

    int x = 0;

    and changing them to be implicit:

    int x;

    This had the effect of reducing the size of the data section of the binary and moved the variable to the BSS section. A simple and safe optimization, albeit somewhat anal.

    Shortly thereafter things started acting funny. The OS would slowly go crazy in very subtle ways after booting. It was not clear what was wrong or if anything actually was wrong, and nobody connected the variable initialization change to the problems with the kernel. After something like 3 months, they finally figured out that when this change was applied to a single variable in the C library it invoked a compiler bug that caused the library to be compiled in such a way that caused the kernel to fail to reset the CPU's floating point registers during a context switch. (How a faulty C library could cause the kernel to do this is still a mystery to me.) This is one of the weirdest bugs I've experienced, though I'm not doing it justice here due to fading memory.
  • by flatulus (260854) on Wednesday June 26, 2002 @03:13PM (#3771898)
    Get this: A few years ago I was doing real-time driver development for an embedded DSP subsystem using TI 54X processors. Months into the development, one of the processors started "losing" a serial port transmitter interrupt. This was an interrupt that (kinda like a machine gun) *MUST* fire every time, or it will never fire again.

    This was a major issue, because when the interrupt was lost the system froze up and had to be rebooted (this is an embedded app - not a desktop).

    I offered to assist the engineer responsible for this code. We spent two days tracing the problem in extreme detail, checking and cross-checking our results. We both concluded that the processor was simply "losing" the interrupt. There was no rational explanation. We adopted the countermeasure of using a fine grain watchdog timer to look for the lost interrupt. This isn't the best solution, since what was to keep the watchdog interrupt from being lost??? But it was the best we could do. And it worked.

    The project lead, however, was very unhappy with our solution. He was convinced that we had overlooked the cause of the problem, which had to be software-based. I countered that, though he could certainly be right, it would be better to leave the watchdog in and let the project move ahead until we stumbled across the real cause in due time. He reluctantly accepted this approach.

    My vindication took five months, but how sweet it was when it came. It turned out that some other company, which also used the 54X chip, had encountered the same problem, but they figured it out (and I'll never know how). The problem was that the 54X (at that time) had a silicon flaw: when certain integer rounding instructions executed at the same instant that an interrupt was being asserted, the interrupt could be "lost". This was confirmed by TI to be a silicon fault, and no amount of software handstands or cartwheels could fix it. The only workaround was to not use those rounding instructions!

    OK- top that....

  • True story (Score:2, Interesting)

    by Beryllium Sphere(tm) (193358) on Wednesday June 26, 2002 @03:13PM (#3771899) Homepage Journal
    I was developing on one IBM mainframe, running on another. It worked fine on the development machine, consistently failed on the machine where I couldn't debug.

    Somehow troubleshot it to an error parsing a data file. Ran some tests to see if the (surprisingly exotic) code for transferring the data file from one system to the other was broken.

    It wasn't, but the test procedure did include making a copy of the data file, and the COPY command put line numbers into the file even though there weren't any to begin with.

    The workaround was to use the NONUM option on the copy command. That was documented behavior, so you could argue that it was programmer error, but I wouldn't agree with you.
  • by schon (31600) on Wednesday June 26, 2002 @03:13PM (#3771901)
    My all-time favourite bug is in the microcode of the 6502/6510...

    An indirect jump where the source address was on a page boundary caused the high-byte to be pulled from the beginning of the current page, instead of the beginning of the next page..

    eg.
    $0100 holds $80
    $01FF holds $32
    $0200 holds $14

    then the command

    JMP ($01FF)

    would load the program counter with $8032, instead of $1432

    First time I saw it used was in some copy-protection code in the C64 version of Sim City.. It was some obfuscation to screw up beginning crackers.. (it threw me for about 5 minutes..)

    Ahh, those were the days :o)
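The indirect-jump behavior in the example above can be modeled in a few lines of C. This is a sketch, not emulator code: the function names are made up, and mem is assumed to point at a full 64 KB address space.

```c
#include <stdint.h>

/* Target of JMP (ptr) as the buggy 6502 computes it: the high byte is fetched
 * from ptr+1 with the carry into the page number dropped, so the fetch wraps
 * around to the start of the same page. */
uint16_t jmp_indirect_buggy(const uint8_t *mem, uint16_t ptr)
{
    uint16_t hi_addr = (uint16_t)((ptr & 0xFF00u) | ((ptr + 1u) & 0x00FFu));
    return (uint16_t)(mem[ptr] | (mem[hi_addr] << 8));
}

/* Target as a programmer would expect it: the high byte comes from ptr+1. */
uint16_t jmp_indirect_expected(const uint8_t *mem, uint16_t ptr)
{
    uint16_t hi_addr = (uint16_t)(ptr + 1u);
    return (uint16_t)(mem[ptr] | (mem[hi_addr] << 8));
}
```

With $0100 = $80, $01FF = $32 and $0200 = $14 as in the post, the buggy version yields $8032 and the expected version yields $1432. For any pointer that does not sit on the last byte of a page, the two agree.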
  • by Kook9 (69585) on Wednesday June 26, 2002 @03:15PM (#3771916)
    I think one of the greatest joys (and greatest frustrations) of programming is finding some erroneous behavior in your code is caused by a third-party library and is thus not your fault. Once I stayed at work with my manager until 10PM because text that was formatting properly on my development system was not formatting properly in production. After beating our heads against the wall for hours tweaking the code, recompiling, making sure it ftp'd correctly, etc., we realized it was a JRE bug. Oh, sweet vindication!

    Now you might ask why my development machine had a different JRE than production... You'd be right. This is not a company that is still in business. 'Nuff said.
  • by PD (9577) <slashdotlinux@pdrap.org> on Wednesday June 26, 2002 @03:17PM (#3771944) Homepage Journal
    I once worked on a DOS app (about 10 years ago) that had a very strange bug. The system would lock up for no reason we could figure out. The offending statement was a printf that printed one single character (a control-G) to the screen to make it beep.

    After some testing on different computers, we discovered that it only crashed on machines with an AMI BIOS. Phoenix BIOS machines worked just fine, and so did real IBM machines. We never dug into the BIOS code to figure out what the problem was, but we mentioned it to the Phar Lap support people (those people were the smartest support people I ever ran across). They told us that they knew about the bug, and even explained what was happening inside the BIOS to make it crash the machine, but it's been 10 years so I don't remember the details anymore.
  • by Elwood P Dowd (16933) <judgmentalist@gmail.com> on Wednesday June 26, 2002 @03:18PM (#3771946) Journal
    As far as I could tell when I was using Perl, running under strict mode would make it so that print() only worked with strings that ended in \n. I can't tell you how long that takes every beginning Perl programmer to figure out. Took me a good four hours.

    My favorite bug in slashcode is that clicking "Parent" in my default story view always returns the default story view, not the parent of the post I'm clicking on. So I have to click on the post ID number, then click parent on the resulting page.
  • by west (39918) on Wednesday June 26, 2002 @03:20PM (#3771987)
    Was in the C compiler in the old Ontario ICON computer, which used a variant of QNX.

    If you had a variable that happened to be the same as the name of a function, then the compiler wouldn't complain, but it would use the address of the function as the value of the variable. Took me a **long** time to figure out where it was getting that value from.
  • Bugs in DOS (Score:4, Interesting)

    by Bistronaut (267467) on Wednesday June 26, 2002 @03:23PM (#3772018) Homepage Journal

    It's turn-back-the-clock time, boys and girls. Remember all of those DOS calls? It was interrupt 20, wasn't it? Remember the findfirst and findnext functions that would get you a list of the files in a directory? You could give findfirst a list of attributes and a filespec, and it would give you a file that matched it (findnext just repeated the last findfirst). Valid attributes to pass were the archive flag, read-only, directory, etc. Except the directory one didn't work! It was simply ignored, so you had to sort out what files were directories or not yourself. What a pain in the ass! And did they ever fix it? I'll give you one guess.

    Oh, and I can't mention old MS bugs without mentioning MASM vs TASM (just because it illustrates why Borland is so cool and MS is not). Back in the day, when applications were coded in assembler, MASM (Microsoft Macro Assembler) was popular as hell. Borland, though, came out with Turbo Assembler, which had a better syntax (optionally), could assemble MASM syntax faster than MASM could, and could emulate all the bugs in the different versions of MASM. Ah.

    Well, that's enough MS bashing for me today (or maybe just this hour...).

  • by MrFenty (579353) on Wednesday June 26, 2002 @03:33PM (#3772149)
    It's nice to think of bugs that can make us smile, but I was shaken up by one bug in hardware, the infamous Intel Pentium f00f bug [ddj.com]. This caused calculations to go wrong, sometimes. Not often, quite rarely, but sometimes. Now, my job (or anyway, part of it) was to design programs that ran calculations to calculate the risk of a serious (fatal) genetic disease being inherited by an unborn child from its mother. For some serious diseases, the hospital I worked at offered abortions for those mothers, if they wished. Whatever your views on such terminations, I made damned sure my code was clean and as bug free as I could make it. Then, I come across Usenet messages that the Pentiums I am using have a floating point bug. This is when such bugs become - literally - life or death problems for users. I believe that we were never affected (we reacted quickly and ran on 486s until we found that the CPUs we had were bug free), but it is a sobering lesson.
  • by deebaine (218719) on Wednesday June 26, 2002 @03:35PM (#3772177) Journal
    This is a popular, popular bug. A look at the NS rendering engine, I've heard, shows that NS always uses percentages, and in fact rounds pixel values to the nearest percentage. I have not personally browsed the code, but a quick test on a frustrating page I once did confirmed the behavior. The difficulty, of course, is that your granularity goes down as resolution goes up. That is, on a 1024x768 monitor, you can have a minimum of 10 pixels in width; on a 1600x1200 monitor the minimum becomes 16.

    One of the most annoying bugs I have ever faced, as there really are only a few acceptable workarounds.

    -db
  • by mcc (14761) <amcclure@purdue.edu> on Wednesday June 26, 2002 @03:35PM (#3772179) Homepage
    As others have noted, this isn't a bug, these are just the stories that the editors decided weren't important enough to warrant a full front-page thing. Funnily enough, these "section page only" articles tend to have much better and more insightful comments than the front page articles, because people only post there if they really care :)

    Beyond that, though, what i liked was that used to, on slashdot, you could post to sid's that didn't exist. Like, you could go to http://slashdot.org/article.pl?sid=haiku [slashdot.org], or something, and while there wouldn't be a story at the top of the page, you could post comments there, and the next person to go to http://slashdot.org/article.pl?sid=haiku would see that comment and could reply to it, until the comment reached a certain age and was automatically deleted. There used to be a whole bunch of these little "hidden" discussion areas littered all over slashdot that people would form entire little communities around them. Unfortunately, this was mostly used for troll groups to coordinate attacks. (K-9-something-inches or something? I don't remember.)

    Unfortunately they seem to have removed this feature from slashdot :( Unless i'm just confused about how it's done, anyway. But it seems to be disabled, going to a non-existent sid now shows "Nothing for you to see here, move along".

    There were some other really bizarre but fun slashdot bugs, like how there was some weird twilight zone area at sid 0, or sid null (or something.. "slashdot.org/?sid=", i think was the url.. i can't remember. i think it was called "test discussion". or something) that you'd sometimes get dumped at if you clicked on the "parent" link in the preview of a post you were writing. Not always, just sometimes. The thing was though, there was some other bug that for some unfathomable reason would sometimes cause posts to get moved out of their correct threads, and into the null discussion, at random. And people wouldn't notice this. And so if you went to the test discussion, you'd just see hundreds and hundreds of random posts, totally irrelevant to each other or anything else, on totally random subjects. It was fun to go through this and try to guess what subjects the posts were on.

    And then there was.. i barely even remember this one. There was a page i managed to get to a couple times-- i can't remember how, but there was a simple way to do it that would work every time-- that just said, "Here are some open discussions", and linked a bunch of articles. The Test Discussion was always near the top of this list. I'd expect that whatever this page is, it's gone now, but can anyone remember what this page was or how i would have gotten to it?
  • by greensquare (546383) on Wednesday June 26, 2002 @03:41PM (#3772237)
    I just figured it out. In vi, make a mark named "d" (for those who are limited out there, do this by simply hitting "m" and then "d"; no ":" is required). Next, move down a line ("j" key, or down arrow for the limited ones). Then do a change to that d mark (type c'd). Do this on Solaris and vi will core dump. vim 5.8.7 on Redhat 7.1 seems to be fixed.
  • Slashdot Slashdotted (Score:5, Interesting)

    by No Such Agency (136681) <abmackay@g m a i l . com> on Wednesday June 26, 2002 @03:43PM (#3772262)
    "All those times when slashdot is messed up and you click on a comment only to be brought back to the front page..."

    It's not a bug, it's a way to avoid returning a 404 when the server's busy. When this happens, it's because the Slashdot server is experiencing a huge load, and it can't keep up with delivering dynamically-generated pages to each visitor. So instead there's a backup static page which it serves you instead. Unfortunately, this function also kills "Slashdot Lite" which I _need_ on my 28.8 modem...
  • by orangesquid (79734) <orangesquid@y a h o o . com> on Wednesday June 26, 2002 @03:45PM (#3772282) Homepage Journal
    Actually, I found quite a strange bug in gcc 2.95. Even with *all* optimisations disabled, this is the result I would get:

    int a = 4;
    complex_expression(dosomething(arg1, arg2, arg3, a));

    Every single time, it would evaluate incorrectly. `a' would be a random value. Seems gcc wasn't assigning it to be 4. So of course I tried:

    int a;
    a = 4;
    complex_expression(dosomething(arg1, arg2, arg3, a));

    I got the same problem. gcc was still not assigning it. The next logical step:

    complex_expression(dosomething(arg1, arg2, arg3, 4));

    It evaluated correctly---the problem had to do with the variable.

    Now here's the fun part.

    You'd think the following code would print ``4'', and evaluate incorrectly, given the trend:

    int a;
    a = 4;
    printf("%d\n", a);
    complex_expression(dosomething(arg1, arg2, arg3, a));

    It printed ``4'' and evaluated perfectly! I was stumped. I tried a dummy function:

    int a;
    a = 4;
    dummy(a);
    complex_expression(dosomething(arg1, arg2, arg3, a));

    No good. The dummy function gets passed some bogus value and the expression evaluates incorrectly.

    I knew that the printf() method worked, and it seemed to work reliably, but I didn't want to always have to print the value. This, however, worked, and didn't clutter the screen:

    int a;
    a = 4;
    printf("", a);
    complex_expression(dosomething(arg1, arg2, arg3, a));

    With some more experimentation, I found dummy(&a) would make the code work too, but only sometimes.

    Strangest bug I've ever seen, and I'm still not sure what caused it.
  • Hmm. My favorite bug (Score:2, Interesting)

    by i_am_pi (570652) <i_am_pi_.hotmail@com> on Wednesday June 26, 2002 @03:53PM (#3772376) Homepage Journal
    in win2k/xp

    #include <stdio.h>

    void main() {
    printf("\t\b\b\b\b\b\b\t");
    }
    Can you say "Bluescreen"?
    Pi
  • by molo (94384) on Wednesday June 26, 2002 @03:59PM (#3772442) Journal
    This program will crash Windows NT with a BSOD. This works on NT 4, Win2K, and WinXP from an *unprivileged* account. There is no known fix available from MS.

    main () {
    for (;;) {
    printf ("Hung up\t\b\b\b\b\b\b") ;
    }
    }

    More information is available at:

    http://homepages.tesco.net/~J.deBoynePollard/FGA/csrss-backspace-bug.html

    This is why I don't run windows.
  • The answer ... (Score:2, Interesting)

    by triptmind (546163) on Wednesday June 26, 2002 @04:00PM (#3772462)
    The reason that round "isn't working properly" is that the rule for rounding, when the last digit is a 5, is to round to the nearest even digit. That is one of the 4 main rules for significant digits. Here are some examples of this rule.

    85 = 8 x 10^1
    80.35 = 8.04 x 10^1
    80.25 = 8.02 x 10^1
    125 = 1.2 x 10^2
    135 = 1.4 x 10^2

    Test these examples, you'll find they're all correct. As for favorite bugs of mine, I just love the bugs that I DON'T have. =)
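The round-half-to-even rule described above is easy to sketch on scaled integers. Working in integers sidesteps the separate problem that a value like 82.845 has no exact binary floating-point representation. The function name is made up; it drops the last decimal digit of n, so 82845 (thousandths) becomes 8284 (hundredths).

```c
/* Drop the last decimal digit of n, rounding half to even ("banker's rounding"). */
long round_half_even_div10(long n)
{
    long sign = (n < 0) ? -1 : 1;
    long mag = (n < 0) ? -n : n;
    long q = mag / 10;   /* digits that are kept */
    long r = mag % 10;   /* digit being dropped */

    /* Round up when past the halfway point, or exactly halfway with an odd
     * last kept digit; otherwise leave the kept digits alone. */
    if (r > 5 || (r == 5 && (q % 2) != 0))
        q += 1;

    return sign * q;
}
```

This reproduces the examples in the list: 85 gives 8, 8035 gives 804, 8025 gives 802, 125 gives 12 and 135 gives 14. It also gives 8284 for 82845, which matches the 82.84 from the original question.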
  • by Anonymous Coward on Wednesday June 26, 2002 @04:05PM (#3772523)
    On very old (version 2.0 IIRC) Microsoft Visual C++ compilers the following code snippet would cause the compiler to grind away for a long time (I got it to go for 2 hours), finally run out of memory and crash.

    void main()
    {
    delete(int);
    }
  • by cjhuitt (466651) on Wednesday June 26, 2002 @04:06PM (#3772537)
    One of the most frustrating bugs was one I encountered in an early programming class in college. I had finished my program, and was testing it. Of course, it wasn't correct. I fixed every bug I could find, but it still wasn't correct.

    So, not knowing about fun things like debuggers, I started putting some cout statements in the code, to check the value of variables at different locations.

    The variables were all correct. And so was the output.

    I started removing cout statements, and found out that when I removed one particular statement, the program started giving incorrect values again.

    Print the value, program output was correct. Don't print the value, it was incorrect.

    So, I experimented some more, and found that I could do a variety of things, such as swapping a couple of my statements, and the program ran correctly without the output of the value.

    I pretty much forgot about this, until another class a couple of years later. We were using the same compiler (I believe it was Borland 5), and I found the exact same problem. This time, however, I couldn't swap statements around to get it to go away. I needed a statement that would do nothing.

    So, I gave up and assigned the variable to itself. (i = i;).

    Imagine my surprise when the program worked correctly. I brought over my TA, and she couldn't make heads or tails of it either. But every time, if that assignment was commented out, the program was wrong, but if it was left in, the program was correct.

    I eventually learned about pipelined instructions, and how a compiler has to be careful that a memory address has the correct value before using it for another statement. I'm pretty sure that's what was going wrong, but I always remember my magical solution of assigning a variable to itself to make it the correct value.
  • by lightcycler (587416) on Wednesday June 26, 2002 @04:08PM (#3772557)
    Pet bug? Try sending an ASCII-zero '\0' to the serial port using MS Visual C++

    Yes, it needs a string. Yes, zero is the end of the string. No, you can't send arbitrary files to the serial port. Duh!
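I don't know which API the poster was stuck with, but the general way around this in C is to pass an explicit length instead of relying on a terminating zero. A minimal sketch, assuming the port can be reached through an ordinary FILE* stream (the function names are made up):

```c
#include <stdio.h>
#include <string.h>

/* Writes exactly len bytes, embedded zeros included; returns bytes written. */
size_t send_bytes(FILE *port, const unsigned char *data, size_t len)
{
    return fwrite(data, 1, len, port);
}

/* What a string-based API effectively sees: everything up to the first zero. */
size_t string_length_seen(const unsigned char *data)
{
    return strlen((const char *)data);
}
```

For the buffer { 'A', 0, 'B', 0 }, string_length_seen reports 1, while send_bytes with a length of 3 writes all three bytes, zero included.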
  • by huh_ (53063) on Wednesday June 26, 2002 @04:18PM (#3772663)
    Also, in Windows 2000, go to the command prompt, and cd to a directory with lots of files. Do a dir, and while it's scrolling past, press F7,Enter,F7,Enter.. over and over.. It crashes every damn time.
  • by metalwheaties (166393) on Wednesday June 26, 2002 @04:33PM (#3772834)
    WAY back in the old line printer days, a former coworker of mine worked as a summer intern at DEC. He was screening bugs in TOPS10 - a mainframe OS in the 70s and early 80s. He got a bug report, with attached tape containing a binary file, that complained: When the attached file is printed on the line printer, the printer catches fire. The REPEATABLE box was checked!

    It seems that a lazy computer operator (remember those guys?) didn't like getting up from his chair to separate the users' print jobs. So, laziness being the true mother of invention, he created a job trailer banner page that included hundreds of overstruck lines completely filled with "_" characters. These had the effect of hammering a line across the paper that eventually cut the paper off at that point.

    Apparently, the drum printers of the day couldn't survive hitting all solenoids simultaneously on every rotation of the drum, so they were overheating, causing (at least) some serious smoking, and maybe a bit of excitement.

  • by Fulcrum of Evil (560260) on Wednesday June 26, 2002 @04:33PM (#3772840)

    The buttons below were pretty cool too. One said "ok" and the other one said "lame!"

    The lame thing was a hack on the dialog code. BillG made a fairly big deal out of bad or confusing error messages, so the devs got the idea to do internal builds with the extra button on every dialog so that you could report a confusing message on the spot. Pretty cool, actually.

  • by kevin@ank.com (87560) on Wednesday June 26, 2002 @04:37PM (#3772873) Homepage
    Recheck your prototypes or compile with gcc -Wall. Either that or if you are working in C++ then one of the other args may be going out of scope before you expect it to... like you've written custom constructors and destructors, but blew it on the copy constructor.

    Printf making it work is irrelevant. That just means that something that is referencing garbage happens to be seeing some data that has the value you want it to use. Not surprising since you've been playing with it on the stack. If all else fails you can use electricfence to track down the violation.

  • by vrmlguy (120854) <samwyse AT gmail DOT com> on Wednesday June 26, 2002 @05:41PM (#3773813) Homepage Journal
    I once had to debug someone else's code that looked vaguely like this:

    READ A,E,I,O,U
    [...]
    X=A+E+I+0+U

    See the problem? Note that in 1976, programmers would write their code on a form that was given to keypunch operators, who "typed" it onto 80-column punch cards that were then fed into the computer. When the author got back from vacation, I refrained from punching him in the face, and just yelled at him instead.
  • by NotoriousQ (457789) on Wednesday June 26, 2002 @05:55PM (#3774002) Homepage
    Actually, it seems that this bug is a bit deeper than just the libs. One of the posts below mentions the same bug in Java on Windows, and I have noticed that the FileStream object in .NET also fails to write out its 1k buffer.

    so many systems, so much the same bug
    (or is it a feature?)
  • by tyler_larson (558763) on Wednesday June 26, 2002 @06:16PM (#3774223) Homepage
    Here's a simple program with some unexpected consequences. It works only on Windows NT-based systems, including XP.

    #include <stdio.h>
    int main()
    {
    printf(" \b\b ");
    return 0;
    }
    To get the full effect you have to run it by double-clicking on the icon, rather than from a DOS prompt. If you want one you can run from a command prompt, replace the printf above with:
    while (1) printf(" \b\b");

    An infinite loop isn't quite as elegant as a single statement that wreaks havoc on your system, but it's still simple enough. In order to generate the "desired" result, you have to backspace beyond the first character of the terminal window, then output a printing character to the left of the beginning of the buffer. Apparently cmd.exe doesn't check for this condition, and triggers an error in a system-critical process.

    I remember Microsoft bragging about how DOS programs run in their own virtual machine, so a mis-behaved DOS app can't crash your computer. I think this example here is proof-positive to the contrary.

    If anybody has any more technical information about the cause (and possibly history) of this bug, I'd love to hear it.

    What's it do? Oh, yeah, it reboots your computer. No shutdown, no warning. Just like hitting the power switch.

    And aren't you glad you paid over $1000 for MS server software that can be rebooted by any user who executes a 4-character printf?

  • by DotComVictim (454236) on Wednesday June 26, 2002 @06:41PM (#3774509)
    When doing an rcp from source machine to target machine, if sum was run on the rcp source machine, the value would sometimes be incorrect. After the rcp finished, the value was correct, there was no data corruption, and the file had been transferred correctly to the target machine. If ftp was used, the problem did not occur.

    It took over 6 months and 12 people to find the problem. The hardware was a uniprocessor MIPS R10k with non-coherent cache. The processor is capable of doing speculative execution which can dirty cache lines. The processor doesn't back out dirty bits when the speculative path falls back. So you can have a piece of code like:

    if (foo) *bar = 1;

    Even if foo is false, the speculative execution can cause the cacheline containing bar to get marked dirty. Normally this doesn't cause a problem. However, if bar is used as a loop variable, and happens to point just past the end of a memory page, a cacheline for a subsequent page can be dirtied. If this page has an active DMA transfer in progress, then the first cacheline on that page can be overwritten with the dirty cacheline, corrupting the DMA data.

    This was not a problem for userspace, since active DMA write pages were not mapped into userspace, but flipped in on completion of the DMA. In the kernel, the problem exists. The solution chosen at the time was to put a compiler workaround, which put a speculation stopping instruction at each conditional branch target. Since this compile switch was only used for the kernel, user binaries remained ABI compliant.

    However, in "volatile" assembler portions of the kernel code (no compiler reordering permitted), this workaround had to be handcoded. After poring through all the assembler by hand, no bugs were found. Finally a perl script was written which would check for store instructions lacking a speculation stopper. Some were found, and all discounted as harmless.

    The problem turned out to be that the MIPS prefetch instruction allows you to pass a cache hint. There was a piece of checksum code that passed a write hint in a prefetch instruction. The fix turned out to be a 1 bit change: change the 7 prefetch code to a 3.
  • Re:Bugs in DOS (Score:2, Interesting)

    by jmooney (138902) on Wednesday June 26, 2002 @06:43PM (#3774541)
    The weirdest set of symptoms I ever had to diagnose was due to one or two bugs in the Microsoft DOS Linker (circa 1985, version 3.61 I think). I was linking C code and Quickbasic code into one executable of about 200-300k (that was a lot when 640k was the system max).

    The first symptom was that I ran my program from DOS, and the previous program that I had been running under DOS sparked back into life briefly then the system hung or rebooted - different each time I ran my broken executable. What was happening was that the DOS loader was not loading the last 64k of the .exe, which was where the entry pointer was, so it just jumped to whatever happened to be in memory already. The linker was somehow getting the filesize word in the .exe header 64k too low (can't remember how long it took to figure that out). I fixed this with a program run by my makefile that checked and if necessary patched every .exe file as soon as it was linked.
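    For anyone curious what such a check-and-patch step might look like, here is a minimal Python sketch. It is not the original tool, which isn't described in any detail. It assumes the standard MZ header layout, where the word at offset 2 holds the bytes used in the last 512-byte page and the word at offset 4 holds the page count. The function names are made up for illustration.

```python
import struct

PAGE = 512  # MZ headers count the load image in 512-byte pages


def mz_declared_size(header: bytes) -> int:
    """Return the image size that the MZ header claims, in bytes."""
    magic, last_page_bytes, pages = struct.unpack_from("<2sHH", header, 0)
    if magic != b"MZ":
        raise ValueError("not an MZ executable")
    if last_page_bytes == 0:
        # A zero here means the last page is completely full.
        return pages * PAGE
    return (pages - 1) * PAGE + last_page_bytes


def patch_mz_size(image: bytearray) -> bool:
    """Rewrite the size fields to match the real length. True if changed."""
    actual = len(image)
    if mz_declared_size(image) == actual:
        return False
    pages, last_page_bytes = divmod(actual, PAGE)
    if last_page_bytes:
        pages += 1  # a partial final page still counts as a page
    struct.pack_into("<HH", image, 2, last_page_bytes, pages)
    return True
```

    An .exe with an overlay or debug info appended legitimately declares less than its file length, so a blanket patch like this only makes sense when you know the whole file is supposed to be loaded.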

    The second symptom was about a year later, in a different version of the linker (still buggy), when I started using MS link .exe compression. I got heap corruption in one part of my code. I added in one debug message, and the problem went away, but came back when I took out the debug. Almost any change anywhere in the program (any source file) changed the symptom. After about 80 hours in the debugger tracking through godawful quickbasic initialisation and memory management, I found there were about 5 bytes of corruption in my static strings. The MS linker .exe compression did simple run-length compression on the executable, and appended some decompression code onto the end of the executable. The problem was that the initial stack pointer used by this decompression routine was supposed to be beyond the end of the file in free memory, but instead it was 64k before the end of the .exe, and whatever happened to be 64k before the end of the .exe got a hole punched in it during the .exe startup code introduced by the linker. I changed my patch program to patch that pointer after every link too.

    I spent a long time fuming at the weeks that I had lost just on this bug over the years. I never did figure out what it was about my program that tripped the bug in the linker.

    Aside: I maintained and supported that program until the early 90's... the second-last native application I developed for a Microsoft Operating System. I got a job in UNIX systems in 1992, in 1995 my company was looking for a way to go GUI, I did one experimental Windows fat-client/server program in early 1995... the last native executable I did for a Microsoft OS. I recommended my company write our GUI for the web instead, and we got a 2-4 year lead over our competition worldwide.

  • by elronxenu (117773) on Wednesday June 26, 2002 @07:08PM (#3774819) Homepage

    I discovered that the network concentrators at Uni would die on any sequence of 4 "n"s in a row (i.e. "nnnn") in the same packet. I was trying to read a man page and puzzled why the system kept dying before I got to the end. Eventually I redirected the man page to a file and used something like an octal dump to find the sequence without displaying it on the screen.

    I then tested, typed "nnnn" and down went the network concentrator. Unfortunately that killed the other 15 users as well...

    I reported this bug to the University Computer Centre who either didn't believe me or took no apparent action, but sometime later they upgraded to a different brand of gear for the campus WAN. I also reported it on comp.risks. I can find no other documented cases of this bug on the Net using a google search.

  • by Christopher H (25358) on Wednesday June 26, 2002 @07:23PM (#3774940)
    I used to be on UWaterloo's ACM programming contest team. More than once I got bitten by the ever-so-simple yet ever-so-annoying to debug 'missing return' phenomenon.

    It seemed that on some architectures (e.g. the local workstation I was testing on), the right value would just happen to be in the right register anyway. On the judges' architecture (fortunately this was a local practice contest, not the world finals!) it failed one time out of one hundred. Yeah, yeah, I know, turn on warnings... it wouldn't fit with the contest mentality: vi + gcc and one terminal and you're set.

    Wasted waaaay too much time on that one.
  • by Anonymous Coward on Wednesday June 26, 2002 @07:29PM (#3774980)
    And, for the record, the most solid workaround is to set each cell to 1% width, and then use transparent GIFs to push out each cell's width.
  • by Anonymous Coward on Wednesday June 26, 2002 @07:36PM (#3775039)
    How to BSOD Linux:

    opensocket();
    closesocket();
    writesocket();

    Oh, and about 10,000 others. Typical Linux FUD, always pointing out flaws in others while completely ignoring their own.
  • by bill_guts (543228) on Wednesday June 26, 2002 @07:51PM (#3775158)
    From "SQL for Smarties" by Joe Celko, 2nd edition, p.51:

    "The scientific method looks at the digit to be removed. If this digit is 0, 1, 2, 3, or 4, you drop it and leave the higher-order digit to its left unchanged. If the digit is 5, 6, 7, 8, or 9, you drop it and increment the digit to its left. This method works for small sets of numbers and was popular with Fortran programmers because it is what engineers use.

    The commercial method looks at the digit to be removed. If this digit is 0, 1, 2, 3, or 4, you drop it and leave the higher-order digit to its left unchanged. If the digit is 6, 7, 8, or 9, you drop it and increment the digit to its left. However, when the digit is 5, you want to have a rule that will round up about half the time. One rule is to look at the digit to the left: If it is odd, then leave it unchanged; if it is even, increment it. There are other versions of the decision rule, but they all try to make the rounding error as small as possible. This method works with a large set of numbers and is popular with bankers because it reduces the total rounding error in the system."


    VB/VBA uses a commercial method. That used to be my Team Leader's favourite question to ask during an interview: "What's the difference between scientific rounding and commercial rounding? Which one does VB use?" No one ever got it right while I was at the company. Most programmers didn't know about the two methods at all.
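    To see the two methods side by side, here is a small Python sketch using the standard decimal module. The helper names are mine, not from the book. ROUND_HALF_UP plays the role of the "scientific" rule. For the "commercial" rule I've assumed the common round-half-to-even variant (ROUND_HALF_EVEN), which keeps an even digit and bumps an odd one. That is one of the "other versions of the decision rule" rather than the exact one quoted, but it is the one that reproduces the 82.84 result from the original question.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_scientific(value: str, places: str = "0.01") -> Decimal:
    """A trailing 5 always rounds up (for positive values)."""
    return Decimal(value).quantize(Decimal(places), rounding=ROUND_HALF_UP)


def round_commercial(value: str, places: str = "0.01") -> Decimal:
    """A trailing 5 rounds toward whichever neighbour has an even last digit."""
    return Decimal(value).quantize(Decimal(places), rounding=ROUND_HALF_EVEN)


# Values are passed as strings so binary floating point never gets involved.
for v in ("82.845", "82.855"):
    print(v, round_scientific(v), round_commercial(v))
```

    The two rules only disagree when the dropped part is exactly a 5, which is why the difference is so easy to miss in testing.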

    As for your tax example (UK, Canada, US, whatever), the gov't does not need to be accurate on tax forms. They can just raise taxes if they can't balance the budget. ;)

    Do you smell that?
  • Re:Fraud Case (Score:2, Interesting)

    by superposed (308216) on Wednesday June 26, 2002 @08:02PM (#3775239)
    ...there was once a case of fraud where a programmer for a bank rounded all the part pennies to a particular account.

    I think this was first tried in a different case in 1983. A few details can be found here [snopes2.com] and here [gnucash.org].
  • inc vs. add (Score:3, Interesting)

    by coyote-san (38515) on Wednesday June 26, 2002 @08:02PM (#3775244)
    When I was in college, we had an introductory class to digital logic (for physics majors) with the emphasis on switches, latches, etc. We also had a single-board processor (8080) that we programmed with hand-assembled code punched into the hex keypad - one of our first projects was *always* to set up a binary->7 segment display encoder so we could read hex output instead of the binary.

    Anyway, our instructors were physics profs who focused on the hardware and never really put any effort into describing the instructions available on an 8080. We had been working at an extremely low level of logic design. At one point we had to write a program to add two numbers and display the results, and I actually wrote one looking something like

    l1: inc ax
    dec bx
    jnz l1

    I was truly dreading doing multiplication and division, but fortunately someone pointed out the basic math opcodes first.
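    For the curious, the loop above can be simulated in a few lines of Python. This is only a sketch: it assumes 16-bit registers that wrap around (suggested by the ax/bx names, although a real 8080 has 8-bit registers and different mnemonics), and the function name is invented.

```python
MASK = 0xFFFF  # assumed register width: 16 bits, wrapping on overflow


def add_by_counting(a: int, b: int) -> tuple[int, int]:
    """Mimic 'l1: inc ax / dec bx / jnz l1'. Returns (sum, iterations)."""
    steps = 0
    while True:
        a = (a + 1) & MASK  # inc ax
        b = (b - 1) & MASK  # dec bx
        steps += 1
        if b == 0:          # jnz l1 falls through once bx reaches zero
            break
    return a, steps


print(add_by_counting(2, 3))
print(add_by_counting(7, 0))
```

    Because the test comes after the decrement, adding zero sends the counter all the way around the register before it stops. The sum still comes out right, it just takes 65536 trips through the loop to get there.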

  • by Anonymous Coward on Wednesday June 26, 2002 @11:10PM (#3776524)
    A while back, there was a bug in the way the CRT unit initialized some timer increment in Borland Pascal. It tried to calculate how many clock ticks it took to complete a loop, then used that as the divisor in another calculation. But any computer that ran over ~200MHz could complete the loop in under a tick, so a /0 error would pop up. It was kind of frustrating to have a program work perfectly while my laptop was unplugged, but get an error *before* the first instruction while it was plugged in.
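    The failure mode as described boils down to dividing by a measured tick count that can be zero. A minimal Python sketch of the idea follows. It is not Borland's actual startup code, and the function names and numbers are made up.

```python
def loops_per_tick(iterations: int, elapsed_ticks: int) -> int:
    """Naive calibration: blows up when the loop finishes within one tick."""
    return iterations // elapsed_ticks


def loops_per_tick_guarded(iterations: int, elapsed_ticks: int) -> int:
    """Same calculation, but treats 'under one tick' as one tick."""
    return iterations // max(elapsed_ticks, 1)


try:
    loops_per_tick(1_000_000, 0)  # fast machine: zero ticks elapsed
except ZeroDivisionError:
    print("calibration failed before the program proper even started")

print(loops_per_tick_guarded(1_000_000, 0))
```

    Clamping the divisor avoids the crash but makes delays too short on fast machines, so a real fix would want a longer calibration loop or a finer-grained timer.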
  • Re:True story (Score:3, Interesting)

    by dubl-u (51156) <2523987012.pota@to> on Thursday June 27, 2002 @12:50AM (#3776944)
    I'd wager that the majority of those who follow the herd and say, "VB is an idiot's language," have never even tried VB.

    I've tried VB; I used it by client mandate on a 4-month project. I started with only a slight bias, and ended up hating it with a burning passion. I wouldn't say it's an idiot's language, but I would agree that it makes weaker programmers.

    Why? Let's ignore VB for a bit and compare Pascal, Perl, and BASIC.

    Pascal is a language made for instruction. It's uptight, a bondage and discipline language [tuxedo.org], forcing you to program in a way that's pretty orderly. Developers who start with it tend to follow those habits later. Perl is neutral; it will allow you to program in a way that's as fussy as Pascal, but it will also let you write utterly perverse code. BASIC has a weird set of limitations and freedoms that actively train people in bad habits that they will have to unlearn later.

    In my experience, VB isn't as bad as BASIC, but it is pretty close. My major complaint is the complaint I have about a lot of Microsoft stuff: on top of a skeleton of early, rushed design (kept for the sake of backwards compatibility) somebody glued multiple layers of marketing's features du jour, doing so without any particular appreciation for elegance.

    A good programmer could spend some time working in VB and live; if they've had some broad exposure, then they won't learn the bad habits that VB allows and the tolerance for ugliness that VB requires. But somebody whose main experience is VB is another story entirely.

    I know of one group of former VB developers who are now Java developers. The whole way through they've worked on a large dynamic website for a major financial institution, which they redid in Java a while back. For the most part the site works, but the code brought me to tears.

    In trying to help these people (who were all smart, committed, and nice) the main problem was that 90% of them couldn't tell good design from bad, elegant architecture from tangled, or clear code from obtuse. I blame this directly on their years of working with a language whose designers didn't know the difference either.
  • by johnw (3725) on Thursday June 27, 2002 @06:30AM (#3777816)
    > OK- top that.

    Some years ago I was working on Prestel software running on the GEC 4000 series mini-computers. One particular problem affected the system at startup (when it was extremely busy for about 30 seconds) and took an awful lot of tracking down.

    The GEC 4000 series was (is?) a real-time system with inter-process message passing built in to the CPU. You load values into registers (one of which will cause a chunk of memory to be passed to the other process) then execute a SEND instruction and away it goes. You can do a GOFREE to accept incoming messages from any source, a WAIT to accept them from one nominated source or a SENDWAIT to send a message and wait for the response. When a message arrives the register values are automatically loaded into the registers for the receiving process and then the process continues.

    At startup the Prestel system had about 200 processes all frantically sending messages to each other. On odd occasions one process would crash, having apparently received a garbage response from a system process. Lots of heavy debugging (including stopping the whole system and printing out large chunks of memory on the console, a TI Silent 700 with thermal paper - remember those?) seemed to confirm that the system process was sending back garbage in response to a request.

    I reported it to the OS guys who took a lot of convincing. After a lot of pressure they agreed to investigate and reluctantly agreed that it wasn't an application fault. In the end it turned out that it wasn't an OS fault either - it was a bug in the CPU. Under heavy load when executing a SENDWAIT the GEC 4160 would very occasionally do neither the SEND nor the WAIT, but allow the process to continue with whatever values were in the registers before.

    Explained like that it sounds simple. Working from the sharp end in the field it was anything but.

