Hardware

Learning x86 for Non-x86 Assembler Programmers?

An anonymous reader asks: "I've done assembler for the 6809, 68000, 8085, MIPS and ARM architectures over the years. But I've never learned assembler for the most common architecture out there. I would like to change that. I can roughly follow my way around x86 disassemblies, but I'm not as good at optimizing/fine-tuning bits of assembler because I am not intimately familiar with all of the addressing modes etc. I would like a book that is targeted at people like me. I would like to be able to fine-tune, say, a blitter in x86 assembler. One thing I do not want in a book is something that is trying to teach me assembler programming in general. Most assembler books seem to fall in the latter category. Are there books out there that might prove useful to me?"
This discussion has been archived. No new comments can be posted.

  • Brute force it (Score:1, Redundant)

    by repvik ( 96666 )
    It's simple! Just use a semi-intelligent program to generate random opcodes, and see which ones compile, and which ones actually run!

    Joke aside, what about checking Intel's pages? I seem to remember there were quite a few documents there on assembly programming...
    • Intel's pages (Score:2, Informative)

      That's a great suggestion; all the low-level documentation is downloadable from Intel's site in PDF format.
      You can also request a free CD with everything you'd need on it.

      AMD also has documentation up on their site, and I think the x86-64 documentation is also available for free on a CD.
  • Helpful website (Score:3, Informative)

    by bdash ( 598142 ) <slashdot DOT org AT bdash DOT net DOT nz> on Friday September 06, 2002 @06:08AM (#4205409) Homepage
    A website that could come in handy for learning about x86 assembly language is DDJ Microprocessor Center [x86.org]. In particular, the On-line Intel Documentation [x86.org] links are almost invaluable when learning to code for the x86 architecture. Being Intel reference manuals, they tend to cut to the chase relatively quickly.
  • Assembly on a processor that runs 2.5GHz? Isn't that a bit like calculating the trajectory of every piece of dust in a desert in a five mile radius of a nuclear blast, by hand?

    Asm is great to understand, which you already do, and it's essential for certain applications... modern x86 processors, however... you don't build modern jets with slide rules and "Standard" measurement units... you get caught in the details and lose sight of the project.

    • There are compelling reasons to use assembler in various situations.
      Granted, you wouldn't write a word processing suite in assembler, but it makes sense for graphics routines, graphics libraries, maths and 3D processing and similar tasks that use a lot of processing power.
      Take a look at programs like dnetc @ distributed.net [distributed.net]. I wonder why it's not programmed in Visual Basic, but actually in assembler (the core, at least). I'd like to see dnetc and seti@home programmed in a high-level language, and still be efficient.

      • Granted, you wouldn't write a word processing suite in assembler,

        Shhhh! Don't tell that to these guys:

        In October 1978 [...] Barnaby began coding WordStar. In four months, Barnaby wrote 137,000 lines of bullet-proof assembly-language code. Rubenstein later checked with some friends from IBM who calculated that Barnaby's output was equal to 42 man years. [link [ucsb.edu]]

        ...or...

        For the past three years, [Jeff Wilson] has been employed by WordPerfect Corporation as a software engineer. While there, he participated in development of WordPerfect for the Apple IIe/IIc computer line. He is currently managing development of WordPerfect for the Atari ST, which should be available shortly after you read this. He programs exclusively in assembly language, and enjoys it! [link [atarimagazines.com]]

        Also, from what I understood, the WordPerfect Corporation actually required that all programs be written in assembler. BTW, some more interesting WP history. [fitnesoft.com]

        Oh, you mean, you wouldn't use assembler to write a word processing suite nowadays. Ok, I getcha. Yeah, I think you're right. After all, WordPerfect Corp has been out of business for how long? (Well technically, bought out and resold, and resold... They're just a name now.)

        --Joe
    • You underestimate the speed gains to be had (up to a factor of 10 in certain cases) by replacing computationally intensive floating point code with SSE or 3DNow! instructions. Assembly Language will always be relevant in these cases. No compiler exists which can exploit SIMD architectures to the full on commodity hardware. http://libsimd.sourceforge.net
    • Is equivalent to a 250MHz processor with 128MB of RAM, running only code written in ASM.

      ASM can often make applications hundreds of times faster, so your 2.5GHz processor running Joe's bloaty crapware is like my P100 running Ollie's carefully crafted and optimized assembler.
    • OK, break out your favorite editor. Write a "Hello World" program in C. Compile, tell the computer to optimize the sh*t out of it. Now, write the same program in ASM. I can guarantee you the ASM version will be about 5x smaller than the compiled one. And Smaller == Faster.
      • Actually, with a modern compiler like gcc 3.2, this might not be true anymore. For relatively simple programs like "hello world", many modern compilers can optimize compiled code to the point where there is no difference from writing the program by hand in asm. I've tested this myself with earlier gcc3.x versions (actually, even before 3.0 came out).

        In this day and age, asm is particularly suited to computationally intensive portions of programs. Yes, it's probably not going to be faster if you write the whole thing in asm; it'll be apt to be bug prone, and the development time will likely be longer.
        • Ah, but can GCC3 compile a binary that takes up only FORTY-FIVE BYTES [muppetlabs.com]?

          Didn't think so. :)

          • *bookmark*
            Oh my $DEITY, that is sweet, Thanks for posting that.
          • Every time I see that article I have to point out that it's easy to do much better than that.

            $ ./a.out; echo $?
            42
            $ wc -c a.out
            7 a.out
            $

            Can you guess the contents of a.out?

            ...

            ...

            ...

            $ cat a.out; echo
            exit 42
            $

            (the "; echo" is there because a.out doesn't have a trailing newline - every byte counts!)

            Stuart.
            • Of course, there's always this to worry about:

              $ wc -c /bin/bash
              530392 /bin/bash

              Which kind of defeats the purpose. :)

              • I don't see why you have to worry about that. The article was about minimizing the size of the executable file itself on disk, not in memory (otherwise some of the last few optimizations would be counterproductive: one of them increases the amount of memory requested in order to reuse a particular byte).

                If the size of bash is going to be counted towards my seven-byte program, you'll have to count the size of the kernel towards his 45-byter. Sure, I still lose because bash+kernel+7 is greater than kernel+45, but at least the ratio isn't so bad as bash+7 vs 45 ;) (besides, I claim I could probably find a combination of kernel and shell which total smaller than the linux kernel, and on which "exit 42" would still work, but his 45-byter is x86-linux specific)
                • The article was about minimizing the size of the executable file itself on disk
                  Correct. And whenever you run a shell script, the executable you're running is actually the interpreter. You could take that 45-byte executable from the link and run it (theoretically) on any Linux kernel that supports ELF. The "exit 42" script will only work on systems which have a shell which contains instructions capable of parsing the string "exit 42." Shell scripts aren't executables, and I feel that as such they should have certain handicaps placed on them when involving themselves in ridiculous contests such as these. :) It's really a matter of apples and oranges. The kicker, as we pedantically load the programs:
                  $ /lib/ld-linux.so.2 ./a.out.bin; echo $?
                  42
                  $ /lib/ld-linux.so.2 ./a.out.sh; echo $?
                  ./a.out.sh: error while loading shared libraries: ./a.out.sh: file too short
                  :)

          • "Of course, half of the values in this file violate some part of the ELF standard, and it's a wonder than Linux will even consent to sneeze on it, much less give it a process ID. This is not the sort of program to which one would normally be willing to confess authorship."

            LOL. This guy should edit everyone else's documentation. A good quick guide to the mechanics of getting started with ASM.
          • debug
            A
            xor ax,ax
            ret
            [esc]
            R CX
            3
            N tiny.com
            W
            Q
            tiny.com

            I can make one 3 bytes!
        • *points to the 45-byte guy who was here before him*

          That is why ASM is better than any HLL. I think the best quote I got from one of my Computer Engineering books was (paraphrasing) "Modern compilers with their optimizations are on the road to becoming almost as good as hand written assembler."

          Now, would I write a word processor in ASM? Not bloody likely, HLLs make it much easier to do. But, when you are writing code for some type of embedded system that doesn't have a whopping 2 GHz processor, ASM will beat any HLL hands down. Unfortunately, too many people think ASM is dead, never learn it, write their embedded code in C and when it isn't fast enough, tell their supervisors that it needs a faster processor. Consider this scenario (stolen from one of my profs):

          • Coder writes embedded system in C.
          • Code isn't fast enough, so company buys faster processor
          • Each processor adds $10 to cost of said system.
          • $10 * 1e6 units = $10e6
          As opposed to this
          • Coder writes embedded system in C
          • Code isn't fast enough
          • Company calls in consultant
          • Consultant reads C, looks at the ASM it creates and spends one night tightening the ASM up.
          • Consultant heads off to Florida for the rest of the week
          • At the end of the week consultant makes himself look disheveled and stumbles in saying "It took all week, but here is the code, it will save you $10 per unit"
          • Consultant charges $2e6, which the company gladly pays, considering it still saves them $8e6.
          See, knowing ASM and how the processor works is a good thing that can make you money (maybe not as much, but still a nice whopping amount for a few days work). ASM is still needed, and anyone who says different does not understand how computers work.
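
          The arithmetic in the scenario above, written out as a quick sketch. Python is used purely for illustration and the names are made up; only the figures come from the scenario.

```python
# Figures quoted in the scenario above; the names are illustrative only.
UNITS = 1_000_000            # 1e6 units shipped
EXTRA_COST_PER_UNIT = 10     # dollars added per unit by the faster processor
CONSULTANT_FEE = 2_000_000   # $2e6 charged for tightening up the ASM


def faster_processor_cost(units: int, extra_per_unit: int) -> int:
    """Total cost of shipping every unit with the pricier processor."""
    return units * extra_per_unit


def net_saving(units: int, extra_per_unit: int, fee: int) -> int:
    """What the company keeps after paying the consultant instead."""
    return faster_processor_cost(units, extra_per_unit) - fee


if __name__ == "__main__":
    print(faster_processor_cost(UNITS, EXTRA_COST_PER_UNIT))       # 10000000
    print(net_saving(UNITS, EXTRA_COST_PER_UNIT, CONSULTANT_FEE))  # 8000000
```

          So the faster processor costs $10e6 across the run, and paying the $2e6 fee instead leaves a net saving of $8e6.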
      • cmp Smaller,Faster (Score:4, Insightful)

        by MarkusQ ( 450076 ) on Friday September 06, 2002 @12:01PM (#4206864) Journal

        And Smaller == Faster.

        Not always. While I have been known to drop into assembly, it should never be the first recourse when you are trying to speed things up. If it is, you are likely to miss out on the biggest savings. My rough priority list:

        1. Find some way to quantify how slow/fast the program is, and where. This might mean using a profiler, but it might not. Slice the data various ways (by high level task, by thread, by low level functions, by data structures accessed, by calling patterns, etc.)
        2. Look closely at the places where the largest chunks of objectionable time are being spent. Consider various refactorings, new algorithms, new data structures, etc. Also look at the customers of the routines in question, to see what their real needs are (e.g. is someone sorting a bazillion data items just so they can pluck the smallest/largest from one end of the structure, or are they recomputing a value that seldom changes) and consider other ways to meet these needs.
        3. Make some test modifications, and repeat
        4. Once you have a good understanding of what the program is doing, and are convinced that it is being done in the most efficient way practical, calculate how long this should be taking. If the actual times are far above your informed estimate, then consider hand coding.
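
        The sorting example from step 2, as a minimal sketch. It's in Python only for brevity, and the function names are invented for illustration. The point is that a full sort does O(n log n) work where one linear pass would meet the caller's real need, and that you time both (step 1) instead of guessing.

```python
import random
import timeit


def smallest_by_sorting(items):
    """The wasteful pattern: sort everything, then pluck one end."""
    return sorted(items)[0]


def smallest_by_scanning(items):
    """Meets the caller's real need with a single linear pass."""
    return min(items)


if __name__ == "__main__":
    data = [random.random() for _ in range(200_000)]
    # Both routines must agree before their timings are worth comparing.
    assert smallest_by_sorting(data) == smallest_by_scanning(data)
    # Step 1 of the list: quantify before changing anything.
    for fn in (smallest_by_sorting, smallest_by_scanning):
        seconds = timeit.timeit(lambda: fn(data), number=5)
        print(f"{fn.__name__}: {seconds:.3f}s for 5 calls")
```

        The exact timings depend on the machine, so none are quoted here. What matters is that the algorithmic fix is available before anyone touches assembly.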

        -- MarkusQ

          • True. While it shouldn't be the first thing you do, dropping into ASM to check things out /usually/ will yield a performance increase, in my experience. Also, learning assembly and such will show you how to write faster conditional statements (IIRC a "while a != b" yields better non-optimized assembly than "while a = b" [could be way off the mark on it, but I was taught an example like that]).

          As always, YMMV.
          • Smaller != faster: if you generate a page fault because your data or a function wasn't properly aligned, then your app will run slow as a dog.

          Unrolled loops have a larger footprint but have a smaller execution path.

            It's always a good idea to try and make core functionality run completely in the CPU cache; if an unrolled loop takes the application core out of cache then don't unroll it. This is where hand assembled code can outstrip compiled code.
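
            For anyone who hasn't seen unrolling written out, here's a minimal sketch of the trade-off. It's Python purely for readability, the function names are invented, and Python itself won't get faster from this. In real life you'd do it in C or asm, or let the compiler do it. Each function returns the sum plus the number of trips through a loop.

```python
def sum_rolled(values):
    """One element per trip: n loop tests for n elements."""
    total = 0
    trips = 0
    i = 0
    while i < len(values):
        total += values[i]
        i += 1
        trips += 1
    return total, trips


def sum_unrolled_by_4(values):
    """Four elements per trip: fewer loop tests, but a bigger loop body."""
    total = 0
    trips = 0
    i = 0
    limit = len(values) - len(values) % 4   # largest multiple of 4 that fits
    while i < limit:
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
        i += 4
        trips += 1
    while i < len(values):                  # leftover 0 to 3 elements
        total += values[i]
        i += 1
        trips += 1
    return total, trips
```

            For ten elements the rolled version makes ten trips and the unrolled one makes four (two unrolled, two leftover), at the price of a loop body several times the size. That extra size is the footprint that can push code out of cache.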

            If you want really fast code then write it in decent Java and have a profiling just-in-time compiler.

    • LAZY!

      Yours is the attitude that causes word processors to need 40MB of RAM to run!

  • Wednesday (Score:2, Informative)

    How about the book that was reviewed here on Wednesday? [slashdot.org]
  • Too many aspects (Score:3, Insightful)

    by e8johan ( 605347 ) on Friday September 06, 2002 @06:54AM (#4205496) Homepage Journal
    I would not recommend anyone to optimize modern x86 asm by hand. If you know your way around disassembled code you know enough to find any (rare) compiler mistakes. Any other operation is usually done better by a compiler. (Please don't yell at me with small, hand optimized special cases, compilers do a good job today if your application isn't very special.)
    If you try to hand optimize asm code for a modern CPU you must consider many issues, among them the reordering of instructions in the processor, the different layouts of the pipelines in different processor models (even Intels differ from other Intels), cache effects (I suppose that you must link everything statically and control where in memory your code will end up)...
    You must also unroll loops, change the access patterns to 2D data structures to improve cache performance, avoid inner loop data dependencies, etc. It is simply too much to handle by hand.
    Since you probably know a higher-level language such as C/C++ and aren't writing a highly optimized operating system (or working without an OS), you do not need to, and should not want to, optimize your asm code by hand!
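
    To make the point about access patterns to 2D data structures concrete, here's a toy model, not a measurement. The assumptions are mine: 4-byte elements, 64-byte cache lines (this varies by CPU), and a row-major layout like C uses. All names are made up. It only counts how often consecutive accesses move to a different cache line.

```python
ELEMENT_SIZE = 4    # bytes per element (assumed)
CACHE_LINE = 64     # bytes per cache line (assumed; varies by CPU)


def offset(row, col, cols):
    """Byte offset of element [row][col] in a row-major 2D array."""
    return (row * cols + col) * ELEMENT_SIZE


def line_switches(accesses):
    """Count how often consecutive accesses land on a different cache line."""
    switches = 0
    previous = None
    for byte_offset in accesses:
        line = byte_offset // CACHE_LINE
        if line != previous:
            switches += 1
            previous = line
    return switches


def row_by_row(rows, cols):
    """Offsets touched when the inner loop walks along a row."""
    return [offset(r, c, cols) for r in range(rows) for c in range(cols)]


def column_by_column(rows, cols):
    """Offsets touched when the inner loop walks down a column."""
    return [offset(r, c, cols) for c in range(cols) for r in range(rows)]
```

    For a 64x64 array, walking row by row changes cache line 256 times, while walking column by column changes line on every one of the 4096 accesses. A real cache holds many lines at once, so this overstates the miss count, but it shows why the traversal order matters.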
    • Must be a troll

      Look, asm is not needed to write M$ apps. It is needed for computationally intensive tasks. While Fortran does a good job in most cases, sometimes it needs hand crafted asm to get the job done in time. When all you have is a hammer everything looks like a nail.

    • Uhmmm...

      Judging by the variety of platforms the poster has coded for, I would say it's likely that most of his assembler is done on EMBEDDED platforms, where assembly language IS still used fairly often and where optimizations WOULD come in handy.
  • by cookd ( 72933 ) <douglascook&juno,com> on Friday September 06, 2002 @07:19AM (#4205533) Journal
    Shh! It's a secret, but Intel offers 4 very nice books [intel.com] at a great price: free.

    They aren't tutorials, so there isn't the same hand-holding that you would get in a book from Barnes & Noble, but they explain things well enough that a seasoned assembly programmer should be able to follow with no problem at all. I think they are exactly what you want.
  • by LordNimon ( 85072 ) on Friday September 06, 2002 @08:24AM (#4205693)
    Michael Abrash's The Zen of Assembly Language. [amazon.com]

    I'm surprised no one has mentioned this book already, because it's exactly what you're looking for. The only problem is that it's dated - it considers the 80386 to be a new processor. There was a time when no self-respecting assembly programmer would be caught dead without it. Alas, I sold mine a couple years ago, since I already learned everything I could from it.

    The only problem is that it (like all of Abrash's books) has been out of print for a long time, and so it will be very hard to find.

  • Zen of Assembly Language might be worth checking out... However, the books/articles are pretty much out of print..

    The most recent printing would be in the Graphics Programming Black Book, but that too has lapsed out of print.

    A Slashdot article a while ago linked to a download of the complete text, i.e. Slashdot [slashdot.org], but it doesn't seem to link there anymore; perhaps someone else would have an idea where to find the download.


  • So you've been doing assembly for well-designed modern architectures and now you want to learn it for that bit of kludgery we call "x86"? My advice is to run your brain through a kitchen blender and then pour it back into your head. The x86 architecture may start making sense after that.

  • Someone mentioned "Zen of Assembly Language" by Michael Abrash. I can recommend "Zen of Code Optimization", also by Michael Abrash. The latter book might be more to your taste, since it focuses on the fine-tuning rather than assembly language itself.

    However, the book is somewhat outdated since it only includes processors up to the Pentium I w/o MMX.

    On the other hand, today's processors are so advanced at doing their own optimization (out-of-order execution, branch prediction, R-OPs, and whatnot), and the different processors do it so differently (AMD vs. Intel) that hand-optimization has far less impact than in the olden days. So you might want to consider if you want to do fine-tuning, except for the obvious (unrolling loops, quadword-aligning your data, to name two things). Also, I think a far greater speed-up can be accomplished by optimizing the algorithm you use. Putting some thought into the algorithm can do wonders, and Abrash agrees with me in "Zen of Code Optimization".

    HTH,
    Michael
  • For Linux (Score:4, Informative)

    by cjpez ( 148000 ) on Friday September 06, 2002 @09:36AM (#4206000) Homepage Journal
    If you're in Linuxland, you might find linuxassembly.org [linuxassembly.org] helpful. I've done some assembly before (only a semester's worth, though), and the site was rather useful to me. If you've never done x86, though, there might not be enough there for you . . .
  • One assembles assembly with an assembler.

    You don't code in "assembler" language, you code in "assembly" language.
  • by Anonymous Coward
    It might not be in print anymore, though I found it invaluable when I had to learn how to debug system BIOSes.

    The Visible series not only teaches assembly, it does so from a novice perspective and includes an emulator to try out the new commands.
  • by Anonymous Coward
    You sound like a real programmer - What are you doing reading slashdot? Slashdot is a site for posers, wannabes, and sysadmins.
  • Since it's optimization you are concerned with, I have a few choices you will be interested in:

    1. The Zen of Code Optimization by Michael Abrash. [amazon.com]
    2. Agner Fog's Assembly Resources [agner.org]
    3. The Athlon Optimization Guide [amd.com]
    4. Intel's IA32 Optimization Guide [intel.com]
    5. The Aggregate Magic Algorithms [aggregate.org]

    These sources will give you everything you need to know about code optimization for x86.
  • I have done a little programming in x86 assembly and I must say that it has given me an excellent understanding of how the processor works. I think it's very neat. I also find programming in assembly fun! Yeah... I'm crazy. I agree. The person who said put your brain into a blender and pour it back into your head was right... You'd have to be nuts to WANT to program in assembly for normal tasks. Well, I'm nuts =) WEEEEEEE!

    But anywho, to answer your question, use the internet... It's invaluable. Fire up good ol' Google and start searching for x86 assembly or x86 ASM. There are plenty of resources out there including lists of opcodes: what they do, how to use 'em, interrupts, even lists of ASM->machine code translations... Just in case you ever wanted to know that 26D26DFF is shr byte [es:di-1],cl. Take a look at NASM (http://www.cryogen.com/Nasm) if you are looking for an assembler... It's free and works nicely =) There is a huge list of interrupts (http://www.pobox.com/~ralf, plus a Windows program, http://www.via.nl/~dms/freeware/rbilviewer/, to read the lists and display them all pretty).
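
    If you're curious how that 26D26DFF example falls apart into fields, here's a rough sketch. It's Python only because it's quick to try, and the function and table names are mine. It deliberately handles just this one shape of instruction in 16-bit mode: an optional segment prefix, opcode D2, a ModRM byte and a displacement.

```python
SEGMENT_PREFIXES = {0x26: "es", 0x2E: "cs", 0x36: "ss", 0x3E: "ds"}
# 16-bit effective-address forms, indexed by the ModRM r/m field.
RM16 = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", "bp", "bx"]
# Opcode D2 is "shift/rotate r/m8 by CL"; the ModRM reg field picks the operation.
SHIFT_OPS = {0: "rol", 1: "ror", 2: "rcl", 3: "rcr", 4: "shl", 5: "shr", 7: "sar"}


def decode_shift_by_cl(code: bytes) -> str:
    """Decode '[segment prefix] D2 modrm [disp]' in 16-bit mode, nothing else."""
    pos = 0
    segment = SEGMENT_PREFIXES.get(code[pos])
    if segment is not None:
        pos += 1
    if code[pos] != 0xD2:
        raise ValueError("only opcode D2 (shift/rotate r/m8 by CL) is handled")
    modrm = code[pos + 1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if reg not in SHIFT_OPS:
        raise ValueError("reg field 6 is not a documented operation")
    if mod == 3 or (mod == 0 and rm == 6):
        raise ValueError("register and direct-address forms are not handled")
    address = RM16[rm]
    if mod == 1:      # one signed displacement byte follows
        disp = int.from_bytes(code[pos + 2:pos + 3], "little", signed=True)
        address += f"{disp:+d}"
    elif mod == 2:    # two displacement bytes follow, shown here as signed
        disp = int.from_bytes(code[pos + 2:pos + 4], "little", signed=True)
        address += f"{disp:+d}"
    if segment:
        address = f"{segment}:{address}"
    return f"{SHIFT_OPS[reg]} byte [{address}],cl"
```

    decode_shift_by_cl(bytes.fromhex("26D26DFF")) gives back "shr byte [es:di-1],cl": 26 is the ES prefix, D2 the shift-by-CL group, 6D splits into mod=01, reg=101 (shr), r/m=101 (di), and FF is the displacement -1.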

    Hope this helps. (No URLs have been checked, but I doubt they've changed)
  • best x86 resource (Score:3, Interesting)

    by green pizza ( 159161 ) on Saturday September 07, 2002 @12:14AM (#4210998) Homepage
    http://grc.com/smgassembly.htm [grc.com]

    Yep, Gibson writes GUI Win32 Windows apps in pure x86 assembly. He's nuts, but his apps are tiny and run fast. Lots of good resources there.
  • Yet another book you might check out is Michael Schmitt's "Pentium Processor Optimization Tools", which is a 200-plus page textbook with handy references to instructions, etc. in the appendices.

    It's from 1994 and thus the pre-MMX, pre-P6 world, and it is not especially well written (although not bad), but there is a good discussion of segmentation and of which instructions pair in the Pentium's different pipelines, and it really walks you through optimizing a bunch of code.

    I think some of it might well be useful relative to the ten bucks it's going for on half.com. (Actually, I found it remaindered for $4.00 at the Cambridge, Mass Micro Center, so you may do a little better too.)
