Handling 'Unexpected Interrupt 0D' Errors Under NT?

Posted by Cliff
from the low-level-serial-programming dept.
Jersiais asks: "I am trying to get some command line stuff running on NT4 server with Take Command installed on an old 200MHz Pentium II (Before anybody throws up, it's the test-it-&-wreck-it machine, not the real thing so there's no actual LAN there). Even on the real thing the compiler under command line has a tendency to blow up at random with 'Unexpected Interrupt 0D'. This only happens on the Pentium II, on the real (Workstation) thing it doesn't. I've found 3 different descriptions of Int 0D, none of which make any sense. Anybody any ideas how to get around it, or get rid of it? The compiler is 32-bit to interpreted intermediate and I have a RP calculator running as a test on the work system already, despite its use of soft interrupt IO."

    • Parent is right.
      http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=utf-8&q=%22Unexpected+Interrupt+0D%22&btnG=Google+Search [google.com]

      It's right there. Again, this doesn't belong in Ask Slashdot. It belongs on Usenet, in one of the asm groups. Alternatively, just use Google; it's right there. Blargh. Why don't people do basic research before posting an Ask Slashdot?
      • I did and it doesn't apply. That's why I posted it here. It doesn't occur at any specific point where I could guess my software is incompatible and there's no drive access going on. Unless it occurs as a 'normal' feature and NT usually handles it out of the way. It isn't even consistent between different user interfaces. What I was looking for was a way to integrate old command line stuff into NT with a reasonable user interface, which Take Command more or less provides (and to find out what it will do as compared to what it's supposed to do). I have got it through since - sometimes - but I guess it's just a case of do it MuckySoft's way or forget it. There are times when sodding about with setting windows up is not worth the effort compared to a quick dirty command line. But I was hoping it needn't be quite as dirty as what's provided!
        • I'm not sure I understand what you're saying. And I'm not sure of your ASM proficiency level, so I'll go into some details that may be redundant to you. And I think you might have said you resolved the problem, so I dunno if this matters anyway.

          According to http://swatch.binary.com.tw/delphi-ti/19057.html [binary.com.tw], found with Google,

          Interrupt 0d is the general protection exception and is generated by any protection violation that does not generate some other exception. See the above question for a more complete description of the problem. Common causes of this problem are network boards and certain hard disk controllers.

          Interrupts can be either software generated or hardware generated. Assuming this is a hardware-generated interrupt, it's raised when the processor receives an IRQ (Interrupt ReQuest). In this case, the processor received interrupt 0Dh (13 decimal). From the article, we get that interrupt 0Dh is our old friend, General Protection. Here, General Protection is (most likely) protecting us from bad hardware. If you trace through your code, or set AfxMessageBox() calls in your code in key places, you should be able to trace where the fault occurs. (AfxMessageBox() does block the thread until you hit OK, BTW.) At this point you should have figured out where the fault gets flagged, and from here you diagnose exactly which hardware is bad.

          If you haven't figured out the problem this way, generate checksums of the file, both on the faulty hardware and on the good hardware, to see whether it was changed by a faulty HDD. If the checksums look OK, then test your memory. If that tests OK, then you may be looking at a faulty CPU. Check to see how hot the CPU gets; that may be what's generating the error.
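
For anyone wanting to try the checksum comparison described above, here is a minimal sketch in Python. It assumes a Python interpreter is available on both machines; any checksum tool would do the same job. The function names `file_sha256` and `same_digest` are made up for illustration.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so that large compiler binaries or
    libraries do not have to fit in memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_digest(digest_a, digest_b):
    """Compare two hex digests, ignoring letter case."""
    return digest_a.strip().lower() == digest_b.strip().lower()
```

Run `file_sha256` on the same file on each machine and compare the two strings. A mismatch points at the disk or the copy; a match rules the file itself out and leaves memory, CPU and the rest as suspects.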

          Or it may be something else entirely. Debugging flaky hardware in software is often quite tricky. I've thrown out MoBos before after diagnosing that something on the MoBo was broken, but never knowing exactly what it was. And I'll do it again. Oftentimes, diagnosing hardware isn't worth the headache.
          • I think it's a speed thing. It turns up in faster machines too but not as frequently. The compiler was probably designed for a 486 or even 386, Windows yes but not NT. On a 486 under DOS it's fine. It isn't my code that's doing it: it's the compiler or its libraries but it shouldn't be executing anything except itself at that point. The weirdness is that lack of consistency. It might go through, it might fail the moment it starts or anywhere between. A soft bug I'd expect to blow up after a specific time and probably as it's working on a particular source library - which it lists to screen as it's going. My main concern was to check whether it would work (including run the result since it's an intermediate translating system) with NT. If it won't, it won't. I don't want my home stuff under NT anyway! Truth is, I don't like mysteries and 'General Error' is as bad as 'DOS Error 21' for informativeness.
  • Reserved (Score:1, Redundant)

    by norwoodites (226775)
    INT 0D is reserved according to my asm book.
  • Finally.. (Score:5, Funny)

    by Noodlenose (537591) on Saturday August 24, 2002 @10:29PM (#4135232) Homepage Journal
    It had to happen one day:

    This is officially the first /. post I don't understand.

    At all.

    Damn..

  • First things first (Score:5, Insightful)

    by ObviousGuy (578567) <ObviousGuy@hotmail.com> on Saturday August 24, 2002 @10:55PM (#4135300) Homepage Journal
    What compiler?

    What is crashing? The compiler? The command prompt?

    What are you doing when it crashes?

    Does this happen with other compilers? Other programs?
  • by Jon-o (17981) on Saturday August 24, 2002 @10:57PM (#4135306) Homepage
    I know next to nothing on the subject, but when I was tinkering about back in the good ole' DOS days, I came across this list of interrupts: http://www.delorie.com/djgpp/doc/rbinter/

    I expect most people have seen it. It lists the following for 0D:

    0D INT 0D C - IRQ5 - FIXED DISK (PC,XT), LPT2 (AT), reserved (PS/2)
    0D INT 0D C - IRQ5 - Tandy 1000 60 Hz RAM REFRESH
    0D INT 0D - HP 95LX - INFRARED INTERRUPT
    0D INT 0D C - CPU-generated (80286+) - GENERAL PROTECTION VIOLATION
  • by droyad (412569)
    Int 0D is for:
    IRQ5 of 8259 (reserved for hard disk XT)

    The 8259 is the onboard interrupt controller, so basically an interrupt on IRQ5 is occurring and Windows doesn't know what to do with it, 'cause something is wrong.

    Check technet.microsoft.com; it's the first place to look regarding Windows.
    • Check technet.microsoft.com; it's the first place to look regarding Windows.
      I didn't think Microsoft's technet was that out of date.
      The XT, for you youngsters out there, was when they added a whopping big 20 meg hard drive to the PC. It predates the 80286-based AT.
      With the AT (80286) they added a second interrupt controller, accessible by 16-bit cards, and moved the hard disk interrupts to IRQ14 (primary) and IRQ15 (secondary) IDE controllers. IRQ5 is now standard for LPT2 but somewhat avoided because of the conflict with hardware interrupt 0Dh on 80286+, the famous General Protection Violation.
      Check google.com when you actually need useful information.
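
The reason IRQ5 and the General Protection exception share vector 0Dh can be shown with a little arithmetic. The sketch below assumes the conventional PC/AT real-mode (BIOS/DOS) programming of the two 8259 controllers, with the first one based at vector 08h and the second at 70h; the function name `irq_to_vector` is only illustrative.

```python
def irq_to_vector(irq):
    """Map an IRQ line (0-15) to its interrupt vector.

    Assumes the traditional BIOS/DOS setup of the two 8259 controllers:
    the first handles IRQ0-7 starting at vector 08h, the second handles
    IRQ8-15 starting at vector 70h.
    """
    if not 0 <= irq <= 15:
        raise ValueError("IRQ must be in the range 0..15")
    if irq < 8:
        return 0x08 + irq
    return 0x70 + (irq - 8)


# IRQ5 lands on the same vector number the CPU uses for exception 13.
print(format(irq_to_vector(5), "02X"))   # prints "0D"
print(format(irq_to_vector(14), "02X"))  # prints "76"
```

Protected-mode operating systems normally reprogram the controllers so that hardware IRQs land on other vectors and no longer collide with CPU exceptions, which is why an 0D report under NT more plausibly means the CPU exception than IRQ5.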

      • Well that clears the confusion about which meaning the interrupt has. Assuming that is that it didn't change after the 286 (Did anyone ever see a 286?). But WTF is 'General Protection Error' asposed ta mean? All I know is that it comes up on faster machines too but not as often and there is no regularity so it does look more like something machine side than software side. It's a mystery but I'm not losing much sleep over it because I'm only using something I know in the hopes of quick&dirty and also putting it through a C++ translator to familiarise with learning that horror.
        • From 80386 Programmer's Reference Manual.
          General Protection Exception
          All protection violations that do not cause another exception cause a general protection exception. This includes (but is not limited to):
          1. Exceeding segment limit when using CS, DS, ES, FS, or GS
          2. Exceeding segment limit when referencing a descriptor table
          3. Transferring control to a segment that is not executable
          4. Writing into a read-only data segment or into a code segment
          5. Reading from an execute-only segment
          6. Loading the SS register with a read-only descriptor
          7. Loading SS, DS, ES, FS, or GS with the descriptor of a system segment
          8. Loading DS, ES, FS, or GS with the descriptor of an executable segment that is not also readable
          9. Loading SS with the descriptor of an executable segment
          10. Accessing memory via DS, ES, FS, or GS when the segment register contains a null selector
          11. Switching to a busy task
          12. Violating privilege rules
          13. Loading CR0 with PG=1 and PE=0
          14. Interrupt or exception via trap or interrupt gate from V86 mode to privilege level other than zero.
          15. Exceeding the instruction length limit of 15 bytes (this can occur only if redundant prefixes are placed before an instruction)
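
A few of the checks in the list above (items 1, 3, 4, 5 and 10) can be sketched as a toy model in Python. This is an illustration only, not how the processor implements them: the `Segment` class, `check_access` function and `GeneralProtectionFault` exception are hypothetical names, and real hardware has more cases (expand-down segments, stack-segment faults, privilege levels) that are left out here.

```python
from dataclasses import dataclass
from typing import Optional


class GeneralProtectionFault(Exception):
    """Stands in for exception 0Dh in this toy model."""


@dataclass(frozen=True)
class Segment:
    limit: int                # highest valid byte offset in the segment
    readable: bool = True
    writable: bool = False
    executable: bool = False


def check_access(segment: Optional[Segment], offset: int, size: int, kind: str) -> None:
    """Raise GeneralProtectionFault if the access breaks one of the modelled rules.

    `kind` is "read", "write" or "execute"; `segment` is None for a null selector.
    """
    if kind not in ("read", "write", "execute"):
        raise ValueError("kind must be 'read', 'write' or 'execute'")
    if segment is None:
        raise GeneralProtectionFault("access through a null selector")
    if offset < 0 or size < 1 or offset + size - 1 > segment.limit:
        raise GeneralProtectionFault("segment limit exceeded")
    if kind == "execute" and not segment.executable:
        raise GeneralProtectionFault("control transfer to a non-executable segment")
    if kind == "write" and (segment.executable or not segment.writable):
        raise GeneralProtectionFault("write to a read-only or code segment")
    if kind == "read" and not segment.readable:
        raise GeneralProtectionFault("read from an execute-only segment")
```

With this model, a pointer that runs a few bytes past the end of a data segment, or a jump through leftover garbage into a data segment, fails the same way regardless of where in the program it happens, which matches the "no consistency" symptom described in the thread.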

          Basically, the machine code is trying to do something highly illegal. How it got there and why are a different matter.
          Flaky memory is always a suspect.
          Computed jumps based on leftover garbage (uninitialized variable) are another fun way to encounter the problem. Random code/data usually crashes eventually.
          It is possible that it's just catching an attempted write to protected storage.

          The Intel 386+ actually does have a very good hardware protection mechanism, which unless somebody managed a port of Multics, is effectively unused and subverted from protected segments to a nice flat space where anybody can do anything to everything.

  • by Anonymous Coward on Saturday August 24, 2002 @11:26PM (#4135384)
    Int 0Dh is General Protection Fault, issued by the processor when illegal instructions or memory accesses are encountered. It's likely your compiler is catching GPF's instead of letting them pass on to Windows where you would get the generic "This program has crashed...blah blah" message. The interrupt could be caused by bad software or bad hardware. Gcc randomly crashes with the same interrupt on bad hardware, normally bad memory or processor cache.
  • Crappy hardware (Score:4, Insightful)

    by cperciva (102828) on Saturday August 24, 2002 @11:35PM (#4135403) Homepage
    Let's see... you have unexpected protection faults, you're running on antique hardware, and when you try the same code on a different machine, it works fine.

    That sounds exactly like the symptoms of hardware which has exceeded its MTBF.
    • It doesn't always work fine, just fewer of these weirdies on Pentium 3. I think it's probably incompatibilities with NT but it's strange that there's no consistency as to when it happens or whether it goes through OK. If it won't run on NT then it won't. But I'd love to know what's going on there. On really antique H/W - 486 under DOS it's fine. I expect it is Windows 95/98 compatible but not NT.
  • I'm sorry sir, but Slashdot does not support that software, please call your OEM for further help.
  • You have a "200MHz Pentium II" and are getting unexplained errors. Perhaps I missed something, but the first Pentium II was 233MHz.

    Maybe the problem is that you don't know what you're doing.

    Either that or I'm a half-drunk asshole. Either answer wouldn't surprise me.
    • by realgone (147744) on Sunday August 25, 2002 @08:29AM (#4136289)
      Maybe the problem is that you don't know what you're doing.
      Either that or I'm a half-drunk asshole. Either answer wouldn't surprise me.
      We should also consider the possibility that you're half-drunk *because* you don't know what you're doing. I mean, come on -- you posted this at 1 a.m. and you *still* weren't fully hoisted yet? What were you drinking, Tequiza or something?

      Get with the program, people...

    • Not true. I had a Pentium II 200 that technically was a factory overclocked 180 MHz.

      • 180MHz Pentium Pro, not Pentium II

        This guy might have a Pentium MMX 200, hell he could have a K6-450, it doesn't seem like he knows what he's talking about.
  • by Anonymous Coward
    It's probably a memory error... page fault or overflow or something similar... I think, based off the similar (borrowed) underpinnings taken from OS/2 for command line, Interrupt 0D errors are the same as OS/2's Trap 0D errors... each are error interrupts, same error code, different way of "naming" them (Trap/Interrupt).

    It's usually caused by the application. A look at the WinNT error docs that should come with their older compilers should turn it up, or a look at OS/2's Trap Explanation help file (or whatever it's called).

    - Rob
    www.WebBinaries.com

  • by Eneff (96967) on Sunday August 25, 2002 @01:25AM (#4135663)
    0D is often hardware.

    That's why it works on the other computer.

    You have three options.
    A. Hope it's some sort of HD corruption and it's just Windows being stupid. Cheapest. Do a full scandisk on it, and see if it's having trouble. If it's not...

    B. Replace the memory. Memory gone bad isn't pretty. If *that* doesn't help,

    C. Throw it out the window, because you probably have some sort of motherboard or other bugs you just don't want to diagnose.

    And thank you for calling Microsoft Technical Support. Do you want the bill on Visa, Mastercard, or Discover?
  • 0D (Score:5, Informative)

    by adolf (21054) <flodadolf@gmail.com> on Sunday August 25, 2002 @03:00AM (#4135830) Journal
    It's not an NT error, but an Intel one, dating back to the Beginning of Time (or the 6MHz 286, anyway). The same errors are reported in the same way under OS/2, and probably a number of other operating systems - I seem to recall Win95 puking out similar nomenclature during at least one BSOD.

    Under OS/2, such screeching halts are known as "traps," instead of blue screens. And since OS/2 users were generally more knowledgeable about computers then than NT users are today, there's a lot of information available to help with fixing it.

    According to a groups.google.com [google.com]-archived message from 1993, 0D is a General Protection Fault.

    GPFs happen all the time with bunky hardware. Try re-seating (or just purchasing new) RAM, CPU, and anything else socketed that you can find.

    And if that doesn't work, toss the machine. Or give it away to someone stubborn enough to fix it. Different boxes of similar ilk are available in the $50 range, these days - no need to spend any absurd amount of time with a diagnosis.
    • Re:0D (Score:3, Informative)

      by Spoing (152917)
      In general, you're right;

      1. Int13 (hex 0D) is an Intel CPU generated error code. (Don't shoot the messenger -- the CPU reports the violation and is very very rarely the reason for the failure.)

      2. If the same software works on one machine but does not work on a similar machine it's often not worth the time to find out why it's failing. (Good guess: it's probably faulty hardware -- damaged or designed broken.)

      In addition...

      3. Int13 can be caused by faulty hardware or software. Bad software usually wins the coin toss. Since it happens in this case while using a compiler, I'd say software is the likely cause -- the compiler or (hate to say) your source.

      4. Only occurs when the processor is in protected mode. Simply stated; you've got no process isolation in an Intel processor's initial mode at boot time, in DOS (not a command prompt) and while in the system BIOS (aka "real" mode).

      5. Protected mode enables the Intel MMU (memory management unit) and requires a program (usually the OS) to manage the GDT (general [memory] descriptor table).

      6. If improperly managed by the GDT control program, processes can bleed into other areas. A proper response by the OS to violating and attempting to modify/read areas it is not allowed to use is to close the process and flag the error.

      7. In quite a few situations, violations (int13 and otherwise) are OK and expected. These violations are used to trigger responses such as virtual memory page swapping and interrupt handling. Anything outside an expected violation may point to hardware failure, software corruption (by an errant program), or

      8. Failures that happen on the OS level can only be caught _after_ the violation _as_long_as_ the process does not nuke critical parts of the OS or the GDT. This means that a violation that is announced usually means your system is in a suspect (possibly unstable) state.

      9. This is why few things should run as extensions to the OS (ring 0) and should be run at the user level (ring 3).
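
To make the descriptor table in points 5 and 6 a little more concrete, here is a rough Python sketch that decodes one 8-byte 386 segment descriptor into its base, limit and a few access bits. The function name `decode_descriptor` and the returned dictionary keys are made up for illustration; the bit layout follows the 386 descriptor format as I understand it, so check it against the Intel manual before relying on it.

```python
import struct


def decode_descriptor(raw):
    """Decode an 8-byte 386 code/data segment descriptor.

    Byte layout (little-endian): limit 15:0, base 15:0, base 23:16,
    access byte, flags (high nibble) + limit 19:16 (low nibble), base 31:24.
    """
    if len(raw) != 8:
        raise ValueError("a segment descriptor is exactly 8 bytes")
    limit_low, base_low, base_mid, access, flags_limit, base_high = struct.unpack(
        "<HHBBBB", raw
    )
    base = base_low | (base_mid << 16) | (base_high << 24)
    limit = limit_low | ((flags_limit & 0x0F) << 16)
    page_granular = bool(flags_limit & 0x80)
    if page_granular:
        # With the granularity bit set the limit counts 4 KB pages.
        limit = (limit << 12) | 0xFFF
    return {
        "base": base,
        "limit": limit,
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 0x3,           # privilege ring, 0 to 3
        "code_or_data": bool(access & 0x10),  # clear for system segments
        "executable": bool(access & 0x08),
        "read_write": bool(access & 0x02),    # readable (code) or writable (data)
    }


# A typical "flat" ring 0 code segment covering the whole 4 GB space.
flat_code = bytes([0xFF, 0xFF, 0x00, 0x00, 0x00, 0x9A, 0xCF, 0x00])
print(decode_descriptor(flat_code))
```

A flat descriptor like the one in the usage example is the kind of "nice flat space" setup mentioned elsewhere in this thread: the limit check can hardly ever fire, so protection falls to paging and privilege levels instead.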

      Rant: Video and other hardware drivers should never run at the OS level, let alone other programs that are not part of the OS, which is specifically designed to manage memory and other core system hardware. Limited and focused use of OS level resources is a necessity -- because if the OS is corrupted, all bets are off, including sane int13 handling.

      • by Spoing (152917)
        Correction: GDT = Global Descriptor Table. It's been a while since I've dealt with this.
      • Thanks. It's not because of old hardware: just the reverse. It is worse on the test machine but it happens on the P3s as well. There may well be bugs in some of the source libraries. I've fixed the ones that actually threw compilation errors. From what you say it sounds like the compiler should be handling this and isn't. It does run in Protected. What puzzles me is the inconsistency but I'm getting a picture that possibly the code isn't fast enough to catch all the interrupts on this machine while on slower ones it does and on faster some of them get just plain lost so fewer get through to throw it out. I'm hoping to get a Linux fixed anyway and there is one compiler, one Algol68-to-C translator free for that. My original intention was to get stuff working and run it through the A2C to get an idea of the various C (or rather C++) libraries involved in standard work. I find C/C++/Java messy, inconsistent and feeble but they happen to be what's in use. Somehow you don't see that many adverts for Eiffel, Ada, Mod3 or other exotica just like nothing ever stood a chance against Fortran and Cobol however much better.
        • by Spoing (152917)
          You're on the right track; if the code depends on hardware events, you have to deal with timing issues.

          Another frequent reason is memory offsets. A slight difference on similar hardware (or with different drivers or software) may allow one system to 'work' (it's corrupting or accessing its own address space -- BAD), or 'fail' (you get an int13 or other error -- actually a good thing; you are told something is wrong).

          This is not an exhaustive list. Happy hunting...

  • I believe the slowest Pentium II ever made was 233MHz. Perhaps the problems you're having are somehow related to the fact that a 200MHz Pentium II never existed...? :)
  • Do you have an Intel NIC? Replace it. I found most BSODs to be from Intel NICs. In Linux they work fine, but in NT they just die at random.
  • Try your programs/compilers on a machine that uses Registered ECC memory. You'd be surprised how many single bit memory errors can occur, especially when the internal case temp of your 'puter gets high, and also when the memory is getting old (as it clearly is in a 200MHz machine).

    If you do not have such hardware available, try just swapping out the RAM in that machine for new memory and see if the problem goes away.

    BTW I picked up a new SuperMicro DLI motherboard (dual P3) w/ ServerWorks chipset and ECC memory mandatory from eBay for $58.

    Particularly on a software development machine, having ECC memory can prevent you from chasing odd bugs that are seemingly random (at least ones that would be due to memory errors).

    Or maybe we're both crazy.
  • I don't know if this [bitwizard.nl] will directly address your problem, but I found it helpful once for diagnosing a bad FPU. There's lots of good tidbits talking about bad hardware and its symptoms.
