Linux Software

Recruiting Help in Smashing Kernel Bugs?

Orm asks: "As we all know, Linux version 3.0 (or 2.6) is about to get a feature freeze, meaning that it is bug-hunting time. And as we saw at the kernel summit, some people are afraid that not enough people are willing to try this new kernel to help get those bugs smashed. My idea for getting more people into the hunt is to write a good paper on how to do it. Not a standard 'how to locate and fix bugs' guide, but a document targeted at people using 2.4.x today who will want to help out. So what does one need to know when testing a new kernel?"

"First: What is new? When I am running menuconfig/xconfig, is there something new I should look into? Will the old /dev directory be replaced with the new devfs-magic?

Second: What needs testing? I guess this goes hand in hand with what is new, but you never know. The non-kernel-dev people may not know everything that has happened since 2.4.x, and there may be particular features that should be focused on more than others.

Finally: How do we do it? How should we test? What are the best ways to localize the bugs? How should we write a bug report? Whom should we send it to?

You do want our help, don't you?

I do hope to see this document at the same time as the feature-freeze. I also hope it will be a well-written piece, so many will feel the 'urge' to test the new kernel and give good feedback."

  • I tried three different config options (removing parts that seemed to cause problems when compiling), then tried the -ac patch, but I always ran into a compile-time error.

    I don't submit those, as I suppose they will be reported anyway (and I only tried to compile it a week ago).

    So, to make me at least test the kernel: make it compile :)
    (I'll give 2.5.45 a try; I've never had this many compile-time problems with any kernel before.)

    • Ditto.

      I can't test software when the basics fail. I've not gotten 2.5.* to compile for even my most basic setup.

      Features that are known to fail (obviously if they don't compile) should not be sent to testing.

      Joe
      • Well, then that's your bug testing: report the compilation failure. Heck, run make oldconfig on Debian's .config and send in everything that fails.
        • by pruneau ( 208454 ) <pruneau@gmail . c om> on Wednesday October 30, 2002 @09:42AM (#4563585) Journal
          Me too!

          The trouble is that it's not an easy-to-track-down failure in some remote-far-not-likely-to-be-used-anyway-freaking-mysterious-device-driver. No, it's something in the basic kernel code (/usr/src/linux/kernel/ or /usr/src/linux/include/linux).
          I know enough of C/compilation to locate the problem and even attempt to quick-patch it, but I do not have the knowledge (nor the time, I'm afraid) to correct it properly and submit a patch by myself.

          The point is, short of posting that on the kernel mailing list (which maybe I should do), is there a better way to handle this? I'm quite willing to help (drat, I tested about 7 to 10 .config files before giving up), but what in hell can I do!?!

          I mean, I know the open-source model is supposed to work in a way where I should try hard to figure things out, but hey, if they want a broad testing audience, they cannot force everybody to learn the kernel.

          There should be a middle ground here, shouldn't there???
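For anyone stuck at the same point, a compile failure is still reportable as long as the log is captured. Below is a minimal sketch, not an official procedure: `first_errors` is a hypothetical helper name, and the patterns assume gcc-style `file:line:` diagnostics plus make's `***` failure lines.

```shell
#!/bin/sh
# Hypothetical helper: print the first few error lines from a saved build log.
# Usage: first_errors LOGFILE [COUNT]
first_errors() {
    log=$1
    count=${2:-5}
    # Keep gcc-style "file:line:" diagnostics and make's "***" failure lines,
    # drop plain warnings, and prefix each hit with its line number in the log.
    grep -n -E '^[^ :]+:[0-9]+:|\*\*\*' "$log" | grep -v 'warning:' | head -n "$count"
}

# Typical use (not run here): save the full build output, then extract the errors.
#   make bzImage 2>&1 | tee build.log
#   first_errors build.log
```

The extracted lines, together with the .config that produced them, are usually enough for someone else to reproduce the failure.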

  • by Sherloqq ( 577391 ) on Wednesday October 30, 2002 @09:10AM (#4563423)
    1. Make sure software compiles! (judging from some replies, even that can't be taken for granted...)
    2. Make sure software runs! (ditto)
    3. Make sure all the functionality you expect from the kernel is there (i.e. if you compiled a driver for a network card, it better work and be backwards compatible, unless it's not supposed to anymore)
    4. Make sure the software is stable (test the bejeezus out of the features -- if your cdrom or a scsi card requires a particular setting to work, or there are three ways for you to reference a device, test all of them, or at least the ones you care about)
    5. If you encounter any bugs (related or unrelated), report them in enough detail for them to be reproducible. Keep it to the point and relevant to the topic. Use spell and grammar checkers :)
    6. When updates/bugfixes come out, lather, rinse, repeat.
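To make step 5 concrete, here is a rough sketch of gathering the environment details that make a report reproducible. The function name `report_header` and the output layout are assumptions for illustration, not a format the kernel developers require.

```shell
#!/bin/sh
# Hypothetical helper: print environment details to paste at the top of a bug report.
# Usage: report_header [PATH_TO_CONFIG]   (defaults to ./.config)
report_header() {
    config=${1:-.config}
    echo "Running kernel: $(uname -r)"
    echo "Machine:        $(uname -m)"
    if command -v gcc >/dev/null 2>&1; then
        echo "Compiler:       $(gcc --version | head -n 1)"
    else
        echo "Compiler:       gcc not found"
    fi
    if [ -f "$config" ]; then
        # Count options that are built in (=y) or built as modules (=m).
        enabled=$(grep -c -E '^CONFIG_[A-Za-z0-9_]+=(y|m)$' "$config")
        echo "Config options: $enabled enabled in $config"
    else
        echo "Config options: $config not found"
    fi
}
```

Running `report_header` from the top of the kernel tree and attaching the .config itself covers most of what a developer will ask for first.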
  • Testing 2.5. (Score:4, Informative)

    by heikkih ( 100839 ) on Wednesday October 30, 2002 @09:49AM (#4563624) Homepage
    How about checking out Dave Jones' introduction to 2.5? [codemonkey.org.uk] :) It will probably be updated as things move along; most notably, LVM2 is already included.
  • by crazney ( 194622 ) on Wednesday October 30, 2002 @10:07AM (#4563755) Homepage Journal
    I had 2.5.34 or something compiled and going. But unfortunately I wasn't savvy enough to hack the nvidia interface to compile against it. So I went back to 2.4.

    Maybe if one of the kernel hackers could spend an hour or so getting the nvidia drivers to compile without unresolved symbols, it'd open the door to a lot of power gaming users and developers (e.g. myself).

    Even though the NVidia drivers don't come with source, there is one source file which is what gets compiled against the kernel. I got as far as getting it to compile, but unfortunately, since the device module interface has changed somewhat, there were unresolved symbols that I didn't know how to fix (though all the unresolved symbols were imported from the source file, so it is fixable).

    craz
    • Try these [starman.ee] patches as mentioned here [theaimsgroup.com].

      Following the linux-kernel list is useful sometimes :)

    • NVidia. There's a love-hate thing.

      I've got a Linux box with an nVidia binary driver that I use at work that's 99.9% great under 2.4.

      Functionally, it's a great thing.

      But I always see the same old exchange on the kernel mailing lists, as people can't get new kernels to work with the binary driver.

      Then someone like Alan Cox usually replies tersely that since they can't see the code for the nVidia driver, they can't help fix the problem. The kernel developers are looking at a black wall.

      Meanwhile, I think the nVidia folks use code in their drivers that is encumbered by patents, NDAs, and competitive-advantage concerns, so they simply won't release the code for the binary driver.

      A stand-off, I guess. As long as folks at NVidia update their drivers I'll be fat and happy. If ever they don't, I'm totally hosed.

      So: are the GPL'd nVidia drivers any good?

    • You can't contribute anything. Any problems you may find may be a result of the NVidia driver not behaving well with 2.5. The LK team has no interest (and I maintain that they should have no interest) in maintaining binary compatibility for modules between major versions.

      So, it's probably better for the LK developers if the NVidia driver *doesn't* compile, as they won't have to sift broken NVidia reports out of the meaningful ones.
      • The LK team has no interest (and I maintain that they should have no interest) in maintaining binary compatibility for modules between major versions.

        If only that was true. 'Problem' is, they don't even maintain binary compatibility for modules between minor versions.
    • I hate to be a drag, but not being able to be on the bleeding edge is simply the cost of using closed source software on an open source system.

      This is why I didn't bother buying anything much for Linux that requires me to use binary drivers.
  • by jgardn ( 539054 )
    A lot of linux users don't know or have forgotten how to install a new kernel. A long time ago, the installers that came with the distributions couldn't hide much of that process from you, but nowadays, you can click through a couple of pages and get a working system.

    Simple instructions on how to take your working Linux system, allocate a couple of gigs of your hard drive for testing, install your favorite distro on top of that, and then replace the kernel with the 2.5 one would go a long ways to getting people to try it out. Also, some pointers on what to look for while testing would be useful, and perhaps instructions on where to report problems so that they get handled would be nice. Where can we find this information if it already exists?

    And isn't there some way to test a linux kernel without rebooting? I have heard of something like this, but it has been so long I don't know any of the details. It would be useful if someone could explain / point to that as well.

    In short, it's not that we don't *want* to help, it's that we don't know *how*.
    • Here's how I test new kernels.

      Get the latest source from here [kernel.org].
      As a non-root user (create the directory if needed):
      mkdir -p ~user/src
      cd ~user/src
      cp ~/tmp/linux-2.5.44.tar.gz .
      gunzip linux-2.5.44.tar.gz
      tar -xvf linux-2.5.44.tar
      mv linux-2.5.44 linux-2.5.44-yih1
      # use a unique tag in place of yih1 (example: yih1 for Your Initials Here 1)
      # If you do another build for 2.5.44, then use yih2, etc
      cd linux-2.5.44-yih1
      # edit the Makefile, and on the 4th line make it
      # appear as: EXTRAVERSION = -yih1
      # if you have a .config from a prior build then
      cp ~user/src/linux-2.4.18/.config .
      # I used 2.4.18 as an example. If you've never
      # built a kernel before, you may have a .config
      # in /usr/src/linux but it will likely be a
      # very full one that will build stuff you don't need
      # if you have an old .config
      make oldconfig
      # that will add new CONFIG_* items that are needed
      # note that you'll have to Return your way through
      # that to add the new items in a disabled state
      # next, whether you had an old .config or not, run
      make menuconfig
      # go through this from top to bottom.
      # it can be confusing at first, but at a
      # minimum, get your processor type correct
      # Use the help function
      # one item you definitely want to enable is
      # your IDE or SCSI stuff obviously.
      # sorry, I can't provide more guidelines here
      # it all depends upon your hardware. You may end
      # up with unusable kernels until you get it correct
      # which can be very frustrating indeed.
      # but once you have a working .config, you'll
      # always be able to use that as a base for future builds.
      # continuing, still as non-root 'user'
      nohup nice make dep clean bzImage modules &
      # wait until it's done. I monitor the build via
      tail -f nohup.out
      # assuming the build is clean[1]
      su # no dash, you want to be root but in the 'user' environment
      # a pwd here should be ~user/src/linux-2.5.44-yih1
      cp arch/i386/boot/bzImage /boot/2.5.44-yih1
      # substitute i386 as needed for your platform
      cp System.map /boot/System.map-2.5.44-yih1
      make modules_install
      # edit your /etc/lilo.conf and add a new boot item, and run lilo
      # if you use grub, do the grub stuff
      # make sure you keep your current kernel boot item in case the new kernel craps out
      reboot

      note that you don't build the kernel as root, nor do you put it into /usr/src/linux.
      Only the install of the kernel is done as root.
      All of the above allows you to keep multiple kernel trees, and allows you to manage them properly.

      [1] - if the build is not clean, try turning stuff OFF in make menuconfig
      Anything new of course may cause compilation failures

      I hope that helps; that was all off the top of my head. I'm sure others can point out extra items to consider.
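The Makefile-editing step in the walkthrough above can also be scripted. This is only a sketch under stated assumptions: `set_extraversion` and `kernel_release` are hypothetical helper names, and the code assumes the top-level Makefile defines VERSION, PATCHLEVEL, SUBLEVEL and EXTRAVERSION as simple `NAME = value` lines.

```shell
#!/bin/sh
# Hypothetical helpers for tagging a kernel tree.

# set_extraversion MAKEFILE TAG : rewrite the EXTRAVERSION line in place.
set_extraversion() {
    makefile=$1
    tag=$2
    tmp=$(mktemp) || return 1
    # Write to a temp file first so a failed sed never truncates the Makefile.
    sed "s/^EXTRAVERSION[ ]*=.*/EXTRAVERSION = $tag/" "$makefile" > "$tmp" &&
        cat "$tmp" > "$makefile"
    rm -f "$tmp"
}

# kernel_release MAKEFILE : print VERSION.PATCHLEVEL.SUBLEVEL followed by EXTRAVERSION.
kernel_release() {
    awk -F'[ \t]*=[ \t]*' '
        $1 == "VERSION"      { v = $2 }
        $1 == "PATCHLEVEL"   { p = $2 }
        $1 == "SUBLEVEL"     { s = $2 }
        $1 == "EXTRAVERSION" { e = $2 }
        END { print v "." p "." s e }
    ' "$1"
}
```

For example, `set_extraversion Makefile -yih1` followed by `kernel_release Makefile` gives a single string that can be reused when naming the files copied into /boot, so the image, System.map and modules directory all carry the same tag.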

  • by FattMattP ( 86246 ) on Wednesday October 30, 2002 @12:42PM (#4565062) Homepage
    Where does one go to see the list of open bugs for the kernel or to file a bug report? Is there a bugzilla for the kernel?
  • benchmarks (Score:2, Interesting)

    Two reasons will exist for people to migrate to the new kernel:

    • new hardware support
    • better performance

    I will let others fight over the hardware support. We have seen a lot of memory manager changes, some changes to the IDE code (although a lot was reversed), and, if I'm not mistaken, some zero-copy improvements on the network layer. It's time we found out if there were actual improvements.

    Get code that works against, say, 2.4.18 (Mandrake 8.0 and RH7.x if memory serves) and against each of the new kernels. Assuming no major kernel changes go out this time, we just get testers to run the benchmark suite against their current configuration and then against their new configuration until we get the new kernels working positively.

    Then the benchmark program(s) dump all relevant data to an XML file and we build a little database where people submit their performance statistics. Of course, the XML would need to be signed to prevent tampering.

    Then all the news junkies can climb over the performance improvements made in the new kernel, and we can avoid issues like those that came up in the early 2.4.x series.
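As a rough illustration of the XML dump idea, the sketch below emits one record per benchmark run. Everything about the format is an assumption: the function name `bench_record` and the element names are invented for this example, and signing is left out entirely.

```shell
#!/bin/sh
# Hypothetical record format; the element and attribute names are made up.
# Usage: bench_record NAME SECONDS
bench_record() {
    name=$1
    seconds=$2
    printf '<benchmark name="%s">\n' "$name"
    # Record which kernel and machine type produced the number.
    printf '  <kernel>%s</kernel>\n' "$(uname -r)"
    printf '  <machine>%s</machine>\n' "$(uname -m)"
    printf '  <seconds>%s</seconds>\n' "$seconds"
    printf '</benchmark>\n'
}
```

Running the same benchmark under 2.4.18 and under a 2.5 kernel, and calling something like `bench_record untar 12.4 >> results.xml` after each run, would give two records that differ only in the kernel and timing fields.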
