How Mainstream Can Code Scavenging Go?

Posted by ScuttleMonkey on Sat Dec 01, 2007 12:13 AM
from the unwanted-easter-eggs dept.
The time-honored tradition of code scavenging has long been a way for new programmers to "break in" to a new language or task that they may not want to build from the ground up. The re-use of old code, cleaned up and tweaked to a new purpose, can help developers learn many useful skills and accomplish tasks quickly, especially for small tasks that aren't of vital importance. One blogger wondered if this process could be formalized and tools could be built to help foster and enable code scavenging on a mass level. Is this a viable option, or are there just too many things to consider?
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • IP Laws? (Score:4, Interesting)

    by gambit3 (463693) on Saturday December 01 2007, @12:18AM (#21541585) Homepage Journal
    So, how quickly would you run afoul of Intellectual Property laws doing this?
    • I want to be the first to welcome our new GPL overlord to the commercial software world.
    • Re:IP Laws? (Score:5, Insightful)

      by caffeinemessiah (918089) on Saturday December 01 2007, @12:24AM (#21541613) Journal

      So, how quickly would you run afoul of Intellectual Property laws doing this?

      That's a great knee-jerk reaction. Without understanding the motivation behind the article, you assume that code scavenging means stealing other people's code. What they're really talking about is (legitimately) re-visiting code that you or other people have written, and then picking and modularizing bite-sized chunks. In other words, you would design a large program (mark I) and then go back and pick out useful parts, clean/debug them and have working modules (mark II) for the next project.

      Also, for people who haven't read TFA, it's 9 short paragraphs long and barely an article. They talk about a "formal approach to code scavenging" without even coming close to explaining what exactly that MEANS.

      • You write program A ... eventually you refactor it and turn parts of it into cleaner modules.

        You can then use those modules in other programs.
        • You write program A ... eventually you refactor it and turn parts of it into cleaner modules. You can then use those modules in other programs.

          short answer: yes. and even before "refactoring" came into vogue, there were other names for it. hence, TFA is not really a FA.

      • Re:IP Laws? (Score:5, Interesting)

        by sootman (158191) on Saturday December 01 2007, @12:56AM (#21541785) Journal
        They talk about a "formal approach to code scavenging" without even coming close to explaining what exactly that MEANS.

        Agreed. Reading TFS, I thought it was going to be yet another "we can make programming like Lego!" thing. (Which it ain't, and probably never will be. [joelonsoftware.com] Bonus reference: "Lego" is mentioned in the second paragraph of this article [wired.com] about Steve Jobs/NeXT/WebObjects from Wired. God bless Wired and their eternally fucked-up CMS that can't serve images for any story in the archive and, this week, shows the actual HTML code that should be formatting the Question-and-Answer portion of the article.)

        Reading TFA, I really don't know much more than I did before. This is the best I could come up with:

        Code scavenging is seen as the most frequent and least complex way of re-using code and has been common practice at an informal level since programming first began. Programming is difficult to teach and most programmers learn their chops by looking at working code and using it as the basis for building their own programs. In other words, they "scavenge" the good bits and tweak them to a new purpose.

        The term scavenging appears to have first surfaced as a formal concept in a 1992 paper by Charles Krueger of Carnegie Mellon University. It was tested by academics in the 1990s but rejected because it yielded few gains for a lot of effort.

        According to Hackett, code scavenging is worth re-visiting because the Web makes it easier to find code and re-use it. He points to sites where massive amounts of existing code are available for potential scavenging such as Google code search, Sourceforge, Code Project, Microsoft's Codeplex, and O'Reilly's Code Search. Others include the Free Software Foundation (FSF), FreeVBcode.com, Freecountry and Freshmeat.

        So, code scavenging is... um, re-use? Can anyone make better sense of that than I can?

        "In other words, they 'scavenge' the good bits and tweak them to a new purpose."

        Um, no. You scavenge the pieces you need, not necessarily the good bits. Have you ever been looking for some code to parse phone numbers, and while looking at source, said "Hey! This looks like a great way to compare two lists!" Probably not. You're only looking for formatting code, so that's all you see, so that's all you get. Looking at source is not like looking at produce at the food store, where you can walk by the tomatoes and they catch your eye because they're perfectly ripe and really, really nice-looking.

        Rather than searching Google, I think every good programmer should take the time to create a really good library. I don't mean take the time writing great code, I mean take the time to organize it into a proper library: make one, clean, well-commented version; put things into variables, ($tableName in queries instead of the actual table name, etc.) and pull code from that when you need it, rather than just copying-and-pasting from the last place you remember using it and then changing all the variable names, table names, etc.

        I plan to make mine Real Soon Now. :-)
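
        The tidy-it-once idea above might look something like this. It is a minimal sketch in Python (the thread itself shows no code, so the language, the `fetch_where` helper, and the `users` table in the usage are all illustrative assumptions). The table name is a parameter instead of being hard-coded at each call site, in the spirit of the $tableName suggestion:

```python
import re
import sqlite3

# Hypothetical personal-library helper: one clean, commented version
# that takes the table and column names as arguments.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def fetch_where(conn, table_name, column_name, value):
    """Return all rows of table_name whose column_name equals value."""
    # SQL placeholders can bind values but not identifiers, so the
    # table and column names are validated before being interpolated.
    for name in (table_name, column_name):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    query = f"SELECT * FROM {table_name} WHERE {column_name} = ?"
    return conn.execute(query, (value,)).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("ann", "Oslo"), ("bob", "Rome")],
    )
    print(fetch_where(conn, "users", "city", "Oslo"))
```

        The point is less the query itself than having a single reviewed copy to pull from, so nothing has to be renamed by hand after each copy-and-paste.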

        >> So, how quickly would you run afoul of Intellectual Property laws doing this?

        > That's a great knee-jerk reaction.


        No, that's just the first thing that popped into his head. (Pardon me if I'm putting words in your mouth, Mr. Gambit.) With that one sentence, he did not say (or imply) "The only people who would use this are thieves." He just put out that question for people to discuss. That topic came up here just a couple days ago. [slashdot.org] I highly recommend reading that discussion. There are some very good points; among them, that if you publish something with no licensing info, it is copyrighted to you by default. (In the US at least,

        • A good comment! And the IP part is important: you can't just take any code you see. You can read it and code the solution yourself in most cases, but there are some darn patent issues - stupid laws patenting algorithms!
          Another good point in the comment: "every good programmer should take the time to create a really good library". This should go without saying: build your own library of templates and snippets that you carry for as long as you code; it often saves a lot of time. Some companies which use SM/CM ( source
        • Re: (Score:3, Insightful)

          These are the ideas that CPAN, PEAR and other code repositories are built on. So instead of trying to reinvent the idea, the author should have poked around a little more and learned more about what is available, as opposed to trying to formalize a "hackup job."

          The parent makes a good point here, where you should take the time to build a clean library base to work with. If one should have a well structured infrastructure, and should they need to implement a certain feature, more often is the case, that a g

        • Re: (Score:3, Funny)

          [I]f you publish something with no licensing info, it is copyrighted to you by default. (In the US at least, and many other countries as well.) So even if you're looking at a site that is, say, clearly marked as a tutorial, that doesn't necessarily mean that you can use that code, unless the guy comes out and says the code is public domain/GPL/etc.

          Good point. I think I'll use it the next time someone comments about code of mine that is overly complex and convoluted.

          "You see, all the simple ways of doing it
      • I think that in some rare cases, code scavenging is helpful. However, in most cases, it does not lead to solidly engineered software. Basically you get a piece of code, review your code, analyze the interface, prefactor, debug, postfactor, and now it is working well. By that time, you could have written a rough draft, debugged, and postfactored in less time and gotten a more consistent codebase out of it.

        So I am not sure at all to what extent this code scavenging is sufficiently helpful to make formalize
    • Some code cannot be copyrighted. HTML, for instance, cannot be locked away and hidden in a vault because it relies on being there so the browser can render it at the time of the download. Thus, HTML is a great "scavenger" language, because it is easy to learn new techniques by poking around the "View Source" code.

      Other code that cannot be copyrighted is copylefted. The GPL expressly guarantees that code can be used, studied, modified, and redistributed.

      Now... formalizing a lesson plan to teach stude

    • I, for one, .... always check the licence of snippets I use :)
  • by kclittle (625128) on Saturday December 01 2007, @12:23AM (#21541607)
    ...since they obviously aren't going to be using it for much longer...
  • Vernor Vinge (Score:5, Interesting)

    by boster (124383) on Saturday December 01 2007, @12:25AM (#21541615)
    In A Deepness in the Sky, Vernor Vinge posited Programmer Archaeologists would replace all new development. http://everything2.com/index.pl?node_id=760521 [everything2.com]
    • Not like it took deep thought to arrive at that conclusion - a look at the nearest sysadmin (myself just as guilty) and the wee habit of script-scavenging is sufficient to serve as a parallel.

      /P

    • Re:Vernor Vinge (Score:5, Insightful)

      by bcharr2 (1046322) on Saturday December 01 2007, @02:41AM (#21542189)
      Code is basically an algorithm that solves a computational problem. So yes, you can cut and paste algorithms. If you want your application to be maintainable, however, you also have to solve the larger architectural problems, which is something you DON'T get when you cut and paste code.

      It is one of the most misunderstood concepts about programming. A programmer fresh out of college knows how to write algorithms really well, but has no idea that there are architectural and design land mines waiting for them just down the development road.

      Too many development shops will get an app "working" and think that is all there is to development, because no one has the depth of experience to look a year down the road and see that they will need to rewrite the entire app from scratch in order to make the simplest of changes.

      I'm sorry, but if your program is neither extensible nor maintainable then you really haven't "succeeded" at anything. You've simply fooled yourself into thinking the process is simpler than it is while screwing your clients out of their development dollars.

        • Re: (Score:3, Insightful)

          The method of assigning homework is to give 3 lines on what a program needs to do. For example, write an FTP client that you can use to download a file. The method of grading is if you can download a file or not. The decision on how to get the HW done with the least amount of time spent is an architectural challenge.

          Which is what the parent hates: people who only think about getting the product out of the door and who sacrifice things like maintainability.

          Even Firefox and Mozilla did a few complete rewrites of the various parts. Rewrites are part of programming, unfortunately. The nice thing about rewrites is that the programmers now have experience on how to do things better and are able to better compartmentalize the code. I wouldn't say rewrites are terrible things - though they do annoy management and people above to no end.

          Firefox & Mozilla are prime examples of bad codebases. We all know the disaster that was the last rewrite in-house at Netscape.
          _SOME_ re-writes are necessary and useful, but a company where the usual way to add a new feature is a re-write of the whole software is doing something wrong.

  • code repositories (Score:3, Informative)

    by drfrog (145882) on Saturday December 01 2007, @12:29AM (#21541639) Homepage
    like cpan and ruby gems etc

    we scavenge code online w/e, find it needs to be used by a lot of people

    so we inherit the scavenged and put it in a nice module and tada!

    this is nothing new

  • semi-formalized (Score:5, Informative)

    by mycall (802802) on Saturday December 01 2007, @12:35AM (#21541675)
    Some people are already doing this, such as koders [koders.com], code fetch [codefetch.com], codase [codase.com], and snippets [dzone.com]. Talk to them about formalizing it, as I'm sure they have some good input.
  • by Animats (122034) on Saturday December 01 2007, @12:35AM (#21541677) Homepage

    The Web 2.0 crowd rediscovers subroutine libraries. Film at 11.

    • The Web 2.0 crowd rediscovers subroutine libraries. Film at 11.

      You gotta punch it up, PHB-ify it: "Reusable Enabling Action-Oriented Web Object Architecture Patterns".
         
    • You joke but I've phone interviewed someone and asked him, "What are some ways of reusing code or implementation?" Inheritance or composition would have been okay. In fact, a lot of answers are acceptable.

      Instead, he answered "uh... cut and paste?"
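
      For anyone wondering what the two accepted answers look like side by side, here is a minimal Python sketch. The class names are made up for illustration and come from nowhere in this thread.

```python
class Logger:
    """Existing code we want to reuse: collects log lines in memory."""

    def __init__(self):
        self.lines = []

    def log(self, message):
        self.lines.append(message)


class TimestampedLogger(Logger):
    """Reuse by inheritance: extend Logger and override one method."""

    def __init__(self, clock):
        super().__init__()
        self._clock = clock  # callable returning a timestamp string

    def log(self, message):
        super().log(f"[{self._clock()}] {message}")


class OrderService:
    """Reuse by composition: hold a logger and delegate to it."""

    def __init__(self, logger):
        self._logger = logger

    def place_order(self, item):
        self._logger.log(f"ordered {item}")
        return item


if __name__ == "__main__":
    logger = TimestampedLogger(clock=lambda: "12:00")
    OrderService(logger).place_order("book")
    print(logger.lines)
```

      Inheritance reuses the implementation by becoming a kind of Logger; composition reuses it by owning one, so the service works with any object that has a `log` method.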

      • The poster you replied to is thinking the "lots-o-books" kind (I'm sure). As in, "a library of subroutines".
  • Google Code (Score:3, Informative)

    by TooMuchToDo (882796) on Saturday December 01 2007, @12:40AM (#21541689)
  • by Yath (6378) on Saturday December 01 2007, @12:41AM (#21541701) Journal
    I guess this is a slow news day. Using bits of code without writing everything from scratch - how novel! How controversial! Is there anyone who doesn't do this? What kind of skull-shattering boredom do you have to endure before you start writing blog entries about this?

    And the first article suggests that trusting the code is an issue, because you didn't write it. Well let's see - it's short, and you just pasted it into your program. But you're not going to bother to read it? You fail. Seriously.
    • I guess this is a slow news day. Using bits of code without writing everything from scratch - how novel! How controversial! Is there anyone who doesn't do this? What kind of skull-shattering boredom do you have to endure before you start writing blog entries about this?

      It might be a new idea for more people than you think. There is this ideal picture of people writing The Perfect Software from scratch, isolated from the rest of the world. And there is the other ideal, people assembling The Perfect Software

    • You said it. I think they have a name for reusing existing code, it's called "Software Engineering."
  • by poppycock (231161) on Saturday December 01 2007, @12:42AM (#21541721)
    Isn't this, you know, a library?
  • by Anonymous Coward
    Why don't you scavenge the dictionary to spell properly?
  • Unlikely? (Score:4, Interesting)

    by SnoopJeDi (859765) <snoopjedi.gmail@com> on Saturday December 01 2007, @12:47AM (#21541747)
    From TFA:

    You are unlikely to find what you want with a simple Web search


    Since... when? Recently I've picked up Perl again, and I've found more than what I need to scavenge to make my own personal extensions to blosxom [blosxom.com] through Google searches.

    I mean, granted, it depends on your definition of a bite-size task, but it's a blanket statement no matter which way you spin it.
  • Isn't that basically the point of a linkable library?
  • by AEton (654737) on Saturday December 01 2007, @01:02AM (#21541813)
    If only there were some computer programming language that had built-in support for some kind of a Comprehensive Archive Network, that would be the best.

    Maybe the C++ language could do it. Then you could just ... hmm ... "import" the things you need from the Comprehensive C++ Archive Network!

    Hmm, CC++AN sounds pretty dumb. It'd never catch on. Oh well.
    • Hmm, CC++AN sounds pretty dumb. It'd never catch on. Oh well.

      Oh, c'mon! It has one potential use - It sounds like a munged pr0n version of something for Perl [cpan.org]. That alone would make the effort worth it, no?

      /P

  • by xxxJonBoyxxx (565205) on Saturday December 01 2007, @01:06AM (#21541831)
    Today Slashdot jumped the shark.

    Seriously. I'm starting to lose brain cells when I read the "articles" these days.
  • It's called Google.
    • And CPAN [cpan.org].

      (and half a bazillion other projects... Hell, it's not like it's been going on since before Microsoft decided they needed a working TCP/IP stack for Windows 2000 or anything...)

      /P

  • Wow (Score:5, Funny)

    by smitth1276 (832902) on Saturday December 01 2007, @01:50AM (#21542007)
    That article used a lot of words to say absolutely nothing. But it got me thinking... perhaps we could group related snippets of code into units called "libraries", and then we could easily use those libraries to perform common tasks?
  • Isn't that what Microsoft is trying to do with PopFly?

    http://www.popfly.ms/Overview/ [popfly.ms]
  • How mainstream are SAE bolts? How mainstream is 18 gauge 304 stainless sheet metal? How mainstream is a CR2016 battery?

    Standardized and well-understood components save a vast amount of effort in other engineering fields and help produce results that are more easily verified to be good.

    Why not apply the same approach to software engineering? Isn't that the greatest promise of open source?
    • That is basically what libraries are, right?

      That is very different than taking a bunch of code, stitching it together, and building a system out of it.
  • The title of this article seems to have been 'code scavenged' ... it makes no sense and wasn't proofread. 'How far can code scavenging go in the mainstream?' perhaps
  • by James Youngman (3732) on Saturday December 01 2007, @07:04AM (#21543061) Homepage
    ... of an SF book I read a few years ago, where all programs were written by a process of digging into 10,000 years' worth of computer programs in a sort of archaeological way, pulling out something that did more or less what you want and amalgamating it with what you had so far. I thought at first that it was a Vernor Vinge [wikipedia.org] book, but checking the plot summaries on Wikipedia, it looks like it was somebody else. Can anybody remember the book I'm thinking of?
    • A while back, I asked one of my programmers to write a routine to dump out a Perl structure. I said I needed it in about a week. Lo and behold, it worked on-time and all was good.

      What is wrong with Data::Dumper? Or rather did you need something that worked with Data::Dumper?

      Turns out, as any Perl expert here knows, my programmer simply took the Dump module, which dumps perl structures. I wanted to have the dump be a nice dynamic javascript html table thing, and my programmer told me no -- or rather that it would require him to do it from scratch, and of course now, a month later, we didn't have the time.

      Having done a lot of work with Perl, I would say that Data::Dumper is the wrong thing to use for that. TemplateToolkit, OTOH, would allow you to quite quickly generate HTML based on your data structure. A competent Perl programmer shouldn't require too much time to do that, esp. if it is in a different document.
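
      To give a sense of scale, here is a rough Python sketch of turning a nested data structure into HTML tables. It is not Template Toolkit or Data::Dumper code, just a hypothetical `to_html` helper built on the standard library, standing in for the kind of rendering discussed above.

```python
from html import escape


def to_html(value):
    """Render nested dicts, lists and scalars as nested HTML tables."""
    if isinstance(value, dict):
        # One row per key; the value cell is rendered recursively.
        rows = "".join(
            f"<tr><th>{escape(str(key))}</th><td>{to_html(item)}</td></tr>"
            for key, item in value.items()
        )
        return f"<table>{rows}</table>"
    if isinstance(value, (list, tuple)):
        # Sequences get one row per element, labelled by index.
        rows = "".join(
            f"<tr><th>{index}</th><td>{to_html(item)}</td></tr>"
            for index, item in enumerate(value)
        )
        return f"<table>{rows}</table>"
    # Scalars are escaped so that markup in the data cannot leak through.
    return escape(str(value))


if __name__ == "__main__":
    print(to_html({"name": "widget", "tags": ["a", "<b>"]}))
```

      A debugging dump and a presentable table are different outputs with different requirements, which is why the two jobs end up as separate pieces of code.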

    • Ultimately, my point is that when you control every line of code, you aren't hampered by other people's restrictions. I would have been happy had my programmer written his own dumping code from scratch, but I also would have been happy had he started with the cpan dump code, learned from it, and created a derivative version. Hell, in this case, I'd have been happy if he had studied it to the point where he could have modified it easily.

      Sadly, here is where you are terribly wrong. The key to doing what you want to do is to understand what you want from the beginning, do the requirements analysis up front, and so forth. There is a *huge* difference between dumping a Perl object (the way Data::Dumper does) and creating a nice HTML document from it.

      Your programmer, if he was smart, grabbed the CPAN module, wrote a little wrapper around it, and integrated it into your application. If you had known what you wanted up front, he would ha