Good Books On Programming With Threads?
Posted by
timothy
on Tue Oct 07, 2008 11:53 AM
from the alternative-is-to-program-naked dept.
uneek writes "I have been programming for several years now in a variety of languages, most recently C#, Java, and Python. I have never had to use threads for a solution in the past. Recently I have been incorporating them more in my solutions for clients. I understand the theory behind them. However I am looking for a good book on programming threads from an applied point of view. I am looking for one or more texts that provide thorough coverage and provide meaningful exercises. Anyone have any ideas?"
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
PThreads & Java Threads (Score:5, Insightful)
However I am looking for a good book on programming threads from an applied point of view. I am looking for one or more texts that provide thorough coverage and provide meaningful exercises. Anyone have any ideas?
I went through grad school not too long ago for Computer Science (disclaimer: it was the kind of computer science degree that doesn't focus on hardware so I might not be the best expert on this). Anyway there were two books for the class.
... wasn't concentrated specifically on applications like you ask but very good reference. Also, I think there are a lot of good books free online [llnl.gov] in respect to that topic.
One dealt with coding regular old C on a plain Jane Unix machine, and one method (I believe there are others) of doing multithreading in that environment is PThreads [llnl.gov] (or the super short overview [wikipedia.org]). The book we used is the Addison Wesley book (ISBN 0-201-63392-2) [amazon.com]. It was informative and comprehensive.
As for Java, there was an O'Reilly book [oreilly.com] (there's probably a new version out for Java 6) that was pretty good. Not as great of a reference but better on applications of threads in Java. Although, as far as introductory material, I personally learned it all from java.sun.com [sun.com]. Although I can't vouch for whether this is an applied approach or not, I would suggest the concurrency tutorial and a good book on Java Patterns or even a design pattern wiki.
I've never done concurrent programming in C# or Python so I do not know firsthand what is best. I do know that Erlang [erlang.org] has been fun to mess around with in my spare time though!
Recently I have been incorporating them more in my solutions for clients.
Most important rule of thumb of multi-threaded programming is to avoid it if possible. Maybe hardware (multi-core) will change that, maybe you feel the scheduler can't do its job as well as you can and maybe you feel it's more intuitive. But, often is the case, that you're just adding more complexity to your code resulting in more difficult bugs and harder maintenance for others. Keep it simple.
Re: (Score:2, Informative)
The Addison-Wesley book mentioned by the parent is "Programming with POSIX Threads" by David R. Butenhof. It's what I used when I needed to get up to speed on p-threads in a hurry - clear and easy to follow. P-threads are what's in Darwin (and so BSD), Linux, and, I'm guessing based on POSIX compliance, just about every commercial flavor of UNIX. (Presumably, OpenServer uses fraying threads)
Re: (Score:3, Informative)
For the morbidly curious, there's even a pthreads library for Windows. LGPLed
http://sourceware.org/pthreads-win32/ [sourceware.org]
Re: (Score:3, Informative)
Most important rule of thumb of multi-threaded programming is to avoid it if possible. Maybe hardware (multi-core) will change that, maybe you feel the scheduler can't do its job as well as you can and maybe you feel it's more intuitive. But, often is the case, that you're just adding more complexity to your code resulting in more difficult bugs and harder maintenance for others. Keep it simple.
Man, I have to disagree with you. That kind of dinosaur thinking will hold back progress. Multi-core is the future and multi-threaded apps are exactly what's needed to fully utilize its potential. I'm sorry if it's too hard for you to debug but it's just the way the cookie crumbles.
Re:PThreads & Java Threads (Score:5, Insightful)
Erm, the tenets of programming usually involve the general concept of "Eliminate the unnecessary." Therefore, the parent is correct: if multi-threaded processing is unnecessary, avoid it.
What you meant to add to the discussion is the corollary: If it is unavoidable, use it wisely.
Re: (Score:3, Informative)
Erm, the tenets of programming usually involve the general concept of "Eliminate the unnecessary." Therefore, the parent is correct: if multi-threaded processing is unnecessary, avoid it.
Although unnecessary, threading usually simplifies a program rather than adding complexity. The only caveat is that you understand threading. In my experience I've used threading to greatly reduce the size and complexity of solutions that either were or could have been implemented without them.
Avoiding threads..... (Score:4, Interesting)
Re: (Score:2)
Re: (Score:2, Insightful)
I'm pretty sure that stuff was some rumor that came out before Barcelona was released about how the Barcelona core was going to 'destroy' Core2 (basically a load of wild and crazy speculation).
There's already some parallelisation of sequential code in all modern processors (out of order execution) that works well because it has fairly narrow focus on the instruction stream window. Going out further would be a much, much larger problem. Looking for parallelism in larger windows of the instruction stream, t
Re:PThreads & Java Threads (Score:4, Informative)
Multi-core is the future and multi-threaded apps are exactly what's needed to fully utilize its potential.
For each application you name that is benefited by threading, someone else will be able to name one that isn't. Some processes simply are not parallelizable in a meaningful way, where meaningful is defined as in speed of execution not as in the interactive extravaganza of "looky how I can clicky the button while it's still doing hard maths".
There's a good bit of reading about the subject, although much of it is boring and is often difficult to apply to real-world situations. Amdahl's law [wikipedia.org] in many situations can predict if it's worth bothering with multithreading (or other forms of parallelizing) quite easily.
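To make the Amdahl's law point concrete, here is a minimal sketch of the usual formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of run time that can be parallelized and n is the number of workers. The function name and the sample fractions are assumptions chosen for illustration; they are not measurements of any real program.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of the run time
    can be spread across `workers` and the rest stays sequential."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical workloads: these fractions are illustrative, not measured.
    for fraction in (0.10, 0.50, 0.95):
        for cores in (2, 4, 8):
            print(f"p={fraction:.2f} cores={cores}: "
                  f"{amdahl_speedup(fraction, cores):.2f}x")
```

The result is an upper bound: it ignores synchronization and scheduling overhead, so a real program would likely do somewhat worse than the number printed.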
A tool like cat or grep has no benefit of being threaded since it's a simple sequential task. Suppose you were to multithread "cat" into one thread that reads from disk, and another that displays a line of text on the screen. Thread 1 will spend most of its time waiting for I/O, and thread 2 will spend most of its time waiting for thread 1 to pass data. Except now, your multithreaded cat has a somewhat complicated synchronization mechanism on top of it that makes it a bit harder to debug and probably eats some extra cycles as well.
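The two-thread "cat" described above might look roughly like the sketch below, shown next to the sequential version it replaces. This is only an illustration: the names threaded_cat, plain_cat and max_pending are made up here, the hand-off uses Python's standard queue.Queue, and error handling in the writer is left out.

```python
import queue
import sys
import threading

_DONE = object()  # sentinel telling the writer that the reader has finished


def threaded_cat(source, sink, max_pending=64):
    """Copy lines from `source` to `sink` using a reader and a writer thread."""
    lines = queue.Queue(maxsize=max_pending)  # bounded hand-off between threads

    def reader():
        try:
            for line in source:      # spends most of its time blocked on I/O
                lines.put(line)      # blocks when the writer falls behind
        finally:
            lines.put(_DONE)         # always release the writer

    def writer():
        while True:
            line = lines.get()       # spends most of its time waiting for the reader
            if line is _DONE:
                break
            sink.write(line)

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def plain_cat(source, sink):
    """The sequential equivalent: same output, no synchronization."""
    for line in source:
        sink.write(line)


if __name__ == "__main__":
    threaded_cat(sys.stdin, sys.stdout)
```

Both functions produce the same output; the threaded one adds a queue, a sentinel and two joins without removing any of the waiting.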
While the previous example is overly simple, there are plenty of tasks that are a lot more complicated but simply have no benefit of being threaded, because they spend more time waiting for I/O than actually calculating or because the algorithm is simply not worth parallelizing because there is no benefit in speed.
Another example would be an application divided in 3 steps. Step A and B can be executed at the same time independently of each other, while step C depends on step A and B. Both step A and B can be written to use two threads, and if they'd use two threads they'd run in half the time of their non-threaded equivalent. On a dual core machine (or 2 CPU machine) running step A multi-threaded and then step B multi-threaded takes 1 hour. In the other case, running step A and at the same time (on the other core/CPU) running step B single threaded also takes 1 hour. At this point you gain nothing by threading. Of course here I assume that I/O by both processes at the same time doesn't create some sort of delay. But if you're working with large enough data sets (more than you can keep in memory) this becomes less and less of an issue since the I/O overhead will already be there anyway.
If you add to that the fact that threading (especially synchronization) is a subject that is not well understood by everyone (in the "find me out of 200 programmers fresh from school, 10 who can write a program that benefits from multi-threading and actually works" sense), threading suddenly becomes less appealing if there aren't any clear benefits for the application you're working on.
The reason I mention that last part is that so many schools give kids the "make two threads count to 100 then exit" exercise but fail completely at getting across that most of the time the threads actually need to synchronize with each other. They'll give this long lecture about the dining philosophers problem without actually SHOWING them what that means.
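As a small step beyond the "count to 100" exercise, the sketch below has several threads update one shared counter, which forces them to synchronize. The Counter class and count_in_threads function are hypothetical names used only for this illustration; the lock is Python's standard threading.Lock.

```python
import threading


class Counter:
    """A shared counter whose increment is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write must happen as one step, so hold the lock.
        with self._lock:
            self.value += 1


def count_in_threads(thread_count=2, increments_per_thread=100_000):
    """Run `thread_count` threads that all increment the same counter."""
    counter = Counter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(count_in_threads())
```

With the lock in place the total is always thread_count times increments_per_thread. Without it, updates may be lost, and whether that actually shows up on a given run depends on the interpreter and on timing, which is part of what makes such bugs hard to find.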
In conclusion: it depends on a lot of factors (size of your dataset, how well your algorithm can be split up in parallel tasks, ...) if your process benefits from threading or not, and you should evaluate at design time using Amdahl's law if there's an advantage or not. If your results in a multithreaded environment are only marginally better, the economical factor of cost of development time suddenly weighs in very heavily.
Having said that: if you're a programmer, have fun with threads at least once. Write something silly in your spare time, it can be an amazing amount of fun and often offers an interesting way of approaching future problems.
Re:PThreads & Java Threads (Score:4, Interesting)
My Master's Degree was in High Performance Computing from UC San Diego, and I taught parallel processing.
Yes, you're right that most new programmers out of college will screw up (and screw up badly) if they try to write a multithreaded application. Learning to write parallel applications requires more mind-bending mental gymnastics than, say, when you first learned to write recursive applications. That said, once you get a solid understanding of what safe parallel code should look like, and how it should work, it's fairly trivial to write code that works, and doesn't deadlock. From my experience, it takes about 3 to 6 months of pounding on parallel code to reach that state.
While it's not a trivial amount of time, given the importance parallel code has (and will increasingly have in the future), I don't think it's too great a hurdle to ask for people to learn this stuff. All talk about multi-core programming always boils down to "Well, we'll never find enough programmers who are able to write multi-threaded apps." Well... why?
I think it would be in the best interests of Intel and AMD to sponsor online college classes teaching how to do parallel coding. People aren't buying the new chips since code can't take advantage of them -- if they flip it around and make every program that could benefit from multithreading (as you point out, per Amdahl's Law) able to multithread, then demand for their chips would surge, and they'd make the money back in billions.
Re: (Score:3, Insightful)
Re: (Score:2)
I used the llnl.gov stuff to teach myself pthreads. I'll second that it is DEFINITELY a good resource, although it misses a few points. Those points are fairly obvious after looking through the pthreads include file with your system and the associated man pages for the functions. RWLocks are a good example.
For C# I used the documentation in Visual Studio. Once you find the class reference, it's extremely useful and well done.
For python, I just go to the python web site. They have three different sets of doc
Re: (Score:2)
Maybe I'm crazy, but if you want a few tasks to be handled in parallel, how can the scheduler take care of it without threading? Forking?
Yes. The current framework of computer science education makes interactions between processes much clearer than interactions between threads. And for years, Apache used processes to handle new requests because it started on platforms with efficient creation of processes.
Re: (Score:3, Informative)
Most important rule of thumb of multi-threaded programming is to avoid it if possible. Maybe hardware (multi-core) will change that, maybe you feel the scheduler can't do its job as well as you can and maybe you feel it's more intuitive. But, often is the case, that you're just adding more complexity to your code resulting in more difficult bugs and harder maintenance for others. Keep it simple.
I'm going to have to disagree with you on this one. Especially in Java client side rich GUI apps, background threads are one of the most useful components to ensure a responsive interface when dealing with asynchronous requests. They really only need two and a half pieces to implement them easily and efficiently. The first component is the request itself, either a subclass of java.lang.Runnable or javax.swing.SwingWorker. The second is a callback handler. The half piece is the shared data structure, and it'
Nobody mentioned needles (Score:2, Funny)
Obligatory serious needle reply (Score:2, Interesting)
The Jacquard Loom [wikipedia.org] involves programming of a sort, albeit without branching or computations. In that sense it's more like a translator, translating punches into patterns. Sort of like printf for the clothing industry.
Mod Parent... (Score:2)
real world haskell (Score:3, Informative)
Re: (Score:2)
Language/Environment specific (Score:4, Informative)
Re: (Score:3, Funny)
if you're doing "enterprise" development it's best to avoid using them and let the application server do its black magic for you
Finally, confirmation!!!! I always suspected all those acronyms to be some form of arcane hex.
Re: (Score:3, Informative)
On the contrary, Pthreads, Java threads and .NET threads are mostly the same thing in different packages.
There are _really_ different ways to implement multithreading: fork-join model, pi-calculus, STM, message-passing model, etc.
Re:Language/Environment specific (Score:4, Informative)
There are _really_ different ways to implement multithreading: fork-join model, pi-calculus, STM, message-passing model, etc.
No, there are different ways of implementing concurrency. Threading, in particular, means shared-memory concurrency with a private control stack. Pi-calculus, STM, Linda and CSP are all examples of other models for concurrency, not of multithreading. They differ in many respects (although pi-calculus and CSP have a lot in common), but share one feature - they are all easier to reason about (and therefore to debug) than multithreading. The only valid use for multithreading is to provide an efficient implementation of one of the other models.
Free eBook on Threading in C# (Score:5, Informative)
I'm still getting the hang of Threading in C# myself, but I found this eBook immensely helpful in helping me understand some of the difficult issues such as Thread Safety, Cross-threading issues, Race Conditions, and Event-Delegate pairs.
http://www.albahari.com/threading/ [albahari.com]
For .NET (Score:2)
Concurrent Programming in Java (Score:5, Informative)
Re: (Score:2)
Re:Concurrent Programming in Java (Score:5, Informative)
Java Concurrency in Practice (Score:3, Insightful)
I highly recommend this book [javaconcur...actice.com] if you are doing threads or any sort of concurrent programming in Java. It's written by the guys who designed Java's concurrency features.
Win32 System Services by Marshall Brain (Score:2)
Before Brain was doing the HowStuffWorks podcast, he wrote what I consider to be the best book of how the low-level stuff in Windows works: Win32 System Services. I learned everything about threads in there.
That said, it's specifically for C/C++ and never mentions Java or C#, so the examples probably won't help. What I really liked was how he explained the Dining Philosophers problem, how mutexes, semaphores, etc., work, and even though he talks a lot about specific API calls, I really learned how Windows w
Two books (Score:2, Interesting)
First: Programming Erlang: Software for a Concurrent World
by Joe Armstrong
http://www.pragprog.com/titles/jaerlang/programming-erlang [pragprog.com]
The Erlang programming language is well suited to developing concurrent programs.
The second book I'd recommend is
Distributed Systems: Principles and Paradigms, 2/E
by Andrew S. Tanenbaum
http://www.pearsonhighered.com/educator/academic/product/0,,0132392275,00%2Ben-USS_01DBC.html [pearsonhighered.com]
Not specific to any programming language, but a very good introduction to the concepts and methods u
not covered in books on threads (Score:3, Informative)
The thread model has some fundamental problems, but since they seem here to stay there are some things you should keep in mind, nicely summarized in this article [berkeley.edu](pdf).
Article also available in html [google.com] if you click on the first computer.org link from Google. Hmm, why does it [computer.org] work from Google and not from Slashdot?
Re: (Score:3, Informative)
Two books - work well together... (Score:2)
"Multi-Threaded Programming Techniques" and "Win32 Multithreaded Programming" (don't worry that it says win32, the other book handles the non-Windows issues.) Together they make an excellent coverage of the topic. Later, get into the particulars of how threads are implemented on different platforms, it's quite interesting and really reflect fundamental differences in where to stress the performance of a system.
Multithreading Applications in Win32 (Score:3, Informative)
Here's one I found useful: Multithreading Applications in Win32 [amazon.com] by Jim Beveridge and Robert Wiener. It's a little dated (no coverage of .NET, for example - it's more focused on C/C++), but it still provides a good introduction to threading and synchronization on Windows.
If you can find an inexpensive used copy, it's worth a read.
Python (Score:2)
Howsabout books or sites on Python threaded programming? I'm going to be working on a project in a short while which will require the use of GTK and twisted together in a sort of network scanner system with asynchronous results.
What the hell? (Score:3, Interesting)
You don't need a book about threaded programming in Python.
You need two books: one about Python, one about threads. Concepts are universal and can be applied across as many languages as you want. It's like saying you need to re-take Calculus because you just learned French!
Re: (Score:3, Informative)
Howsabout books or sites on Python threaded programming? I'm going to be working on a project in a short while which will require the use of GTK and twisted together in a sort of network scanner system with asynchronous results.
As much as I love Python, it does have some weak points, and threading is one of them. From the Python documentation:
The Python interpreter is not fully thread safe. In order to support multi-threaded Python programs, there's a global lock that must be held by the current thread before it can safely access Python objects.
Threading is there, and I'm sure some decent documentation exists somewhere. But the GIL (global interpreter lock) generally means that there are better ways to approach the problem in Python, i.e. processes instead of threads.
It's a point of contention in the community, and the GVR-BDFL point of view is that any attempt to remove it makes Python a lot slower, so he won't.
While I don't use
oldie but a goodie (Score:3, Informative)
Some background in parallelism is helpful for mastering threads.
I learned from this book:
http://www.lindaspaces.com/book/ [lindaspaces.com]
C-Linda never caught on, but it's not hard to read the examples and apply them to pthreads, Java, MPI or whatever framework you're using.
If you want a software design book (Score:2)
If what you want is to design multi-threaded applications (thus, more than just coding multi-threaded apps), then the book you want is: Concurrent Programming in Java [oswego.edu]
A little on the stuffy side at times (not quite as easy a read as Design Patterns) it still provides a deep understanding of the trade-offs and techniques used when designing multi-threaded applications. Personally, I found myself again and again using the lessons and many of the patterns of that book when designing new systems (or fixing syste
On Windows: "Concurrent Programming on Windows" by (Score:2)
Joe Duffy (of the .NET Parallel Extensions team) has an excellent book due to be published very soon:
Concurrent Programming on Windows [amazon.com]
I haven't found a decent book, but... (Score:3, Informative)
Herb Sutter has been doing a lot of work on this stuff over the last 10 years and his blog is full of stuff on what you should do... it's not too nitty gritty in terms of languages and stuff, but it's very informative in terms of understanding the issues and what not. Check out http://herbsutter.wordpress.com/ [wordpress.com].
Some rules of thumb that I've found useful:
I believe that following strict OO guidelines is even more important when dealing with concurrency than when dealing with general ideas in software... and let's face it, it's extremely important even when not dealing with concurrency :)
Re: (Score:3, Informative)
Locking sucks, but it's necessary. If you think you can get away without having to lock in a dubious situation, you're probably wrong.
There are lots of good, reusable, lockless data structures around if you know where to look. Keir Fraser's PhD thesis contains a really nice lockless ring buffer design (which he implemented for Xen) and several other useful things (including a transactional list and some other shiny stuff). If you have implementations of these in a library somewhere, then you can often get away without locks. There is one rule you should always obey when writing parallel code though:
No data may be aliased and mutable.
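One way to read the rule above is sketched below: a thread keeps its mutable data private and hands other threads only immutable snapshots. This is an illustration of that reading, not a rendering of the lockless structures from the thesis mentioned above; the function names are invented for the example, and the queue.Queue objects are shared but do their own locking internally.

```python
import queue
import threading


def summarize(batches, results):
    """Worker: only ever sees immutable tuples, so nothing it reads can change."""
    while True:
        batch = batches.get()
        if batch is None:           # None is the agreed shutdown signal
            break
        results.put(sum(batch))


def run():
    batches, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=summarize, args=(batches, results))
    worker.start()

    readings = [1, 2, 3]            # mutable, but private to this thread
    batches.put(tuple(readings))    # hand over an immutable snapshot, not the list
    readings.append(100)            # later mutation cannot affect the snapshot
    batches.put(tuple(readings))

    batches.put(None)
    worker.join()
    return [results.get(), results.get()]


if __name__ == "__main__":
    print(run())
```

Because the worker never holds a reference to the list itself, the append cannot race with the sum, and no explicit lock appears in the application code.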
I can suggest three books... (Score:5, Funny)
Here are two good ones: (Score:4, Funny)
Weaving Technology, Joseph Jacquard, Colonies? What Colonies? Publishing, 1801.
Python doesn't have threads (Score:4, Informative)
That might seem wrong given that Python lists threading modules, but just look at Python's GIL to know what I mean. As in, no matter how many threads you start, only one of them runs Python bytecode at a time, so a CPU-bound program effectively uses one core. So, if you just want a performance boost because of a lot of I/O, then threads can get you there. Unfortunately, if you want to take advantage of a multi-core CPU with Python, Python's threads won't get you there. There has actually been a lot of discussion on this topic, but Guido just refuses to remove the GIL. The interpreter is not fully thread safe, which is why the lock is there.
If you want to do multi-processing with Python, look at its subprocess module.
Guido's blog post on the GIL:
http://www.artima.com/weblogs/viewpost.jsp?thread=214235 [artima.com]
The FAQ entry on a (fallacious) reason why they won't remove it:
http://www.python.org/doc/faq/library/#can-t-we-get-rid-of-the-global-interpreter-lock [python.org]
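The subprocess suggestion above could be sketched as follows: each piece of CPU-bound work runs in its own child interpreter, which has its own GIL, so the operating system is free to place the children on different cores. The child program and the function name are hypothetical and exist only for this example; the standard library's multiprocessing module is another route to the same goal.

```python
import subprocess
import sys

# Hypothetical CPU-bound task, run in a fresh interpreter with its own GIL.
CHILD_CODE = "import sys; n = int(sys.argv[1]); print(sum(i * i for i in range(n)))"


def sum_squares_in_processes(sizes):
    """Start one child interpreter per input, then collect their results."""
    children = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(n)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for n in sizes
    ]
    results = []
    for child in children:
        out, _ = child.communicate()   # waits for this child to exit
        if child.returncode != 0:
            raise RuntimeError(f"child exited with status {child.returncode}")
        results.append(int(out))
    return results


if __name__ == "__main__":
    print(sum_squares_in_processes([1_000_000, 2_000_000]))
```

All children are started before any result is read, so they run concurrently. Starting an interpreter has a cost of its own, so this only pays off when each piece of work is large enough.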
Re: (Score:3, Informative)
Actually, the conclusion is not supported by the reasoning. For those that don't like clicking links, Guido's reason is that there exists a patch that removed the GIL and replaced it with fine-grained locks. This failed miserably. BUT, when one thinks about it, this implementation would certainly be doomed to fail for obvious reasons.
When one implements fine-grained locks, every time something is accessed, it is locked, accessed, and released. Clearly, this will impact performance on even a single threaded app
Re: (Score:3, Insightful)
No. IPC is dirt cheap. Take X11 for example -- that's perfectly usable using plain old named pipes, even without Xshm.
Most of the time, a threaded GUI program will want to use threads in order to perform some operation or another "in the background" while the UI remains responsive. If this operation has well-defined inputs and outputs, why not write it as a separate program? Communication overhead is going to be low.
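A rough sketch of the "separate program with well-defined inputs and outputs" idea: a long-lived worker process that reads one request per line on its standard input and writes one reply per line on its standard output. The worker program, the SquareWorker class and the five-second timeout are all assumptions made for this illustration.

```python
import subprocess
import sys

# Hypothetical worker program: reads one number per line, writes its square.
WORKER_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(int(line) ** 2, flush=True)\n"
)


class SquareWorker:
    """Talks to a separate worker process over its stdin/stdout pipes."""

    def __init__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_CODE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def square(self, n):
        self._proc.stdin.write(f"{n}\n")   # one request per line
        self._proc.stdin.flush()
        return int(self._proc.stdout.readline())  # one reply per line

    def close(self):
        self._proc.stdin.close()           # EOF ends the worker's read loop
        status = self._proc.wait(timeout=5)
        self._proc.stdout.close()
        return status


if __name__ == "__main__":
    worker = SquareWorker()
    print(worker.square(7))
    print("worker exit status:", worker.close())
```

Here square blocks until the reply arrives. A GUI program would more likely watch the worker's output pipe from its event loop so that the interface stays responsive while the worker computes.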
Re: (Score:3, Insightful)
There's also the issue of process management. When the other end of that named pipe breaks, what happens to that separate process? Is it really dead? If it's still alive, how do you kill it cleanly?
I'm not saying separate processes are bad, I'm saying that they're appropriate for certain problems, just like threaded applicati