Software Distribution via Multicast?
RockyMountain asks: "When it took me over 24 hours to download the latest Mandrake ISOs, I got to wondering... why do we still put up with servers overloaded with zillions of simultaneous TCP connections, all sending copies of the same thing? Hasn't multicasting evolved to the point where there's a better way? A quick look at Freshmeat turned up no obvious candidates. Are there any protocols or programs for distributing software via multicast? Are there any evolving standards? Or are there fundamental problems with this approach that I am overlooking?" An interesting question. With my limited understanding of multicast, I would think that, at the very least, a software distribution site could offer "channels", where each channel serves one piece of software. Multicast clients wanting a specific piece of software would join the right channel and wait until the next time it starts serving the software from the beginning (or, in the case of an interrupted connection, until the channel reaches the appropriate resume point). Might such a system be ideal for multicast? Can any of you come up with others?
Some problems with multicast... (Score:1)
But a multicast model is worth a look. On September 11th, it was, for the most part, a multicast model (broadcast television) that got us our information. Most of the news web sites could not handle the unicast load. So it's a good idea, but like you said, not much seems to be going on with it. I'd love to hear about some good, real-world implementations of multicast.
KidA
Why wait? (Score:1, Interesting)
Server sends the file over and over again, as long as there is at least one member of the multicast group.
Clients join the multicast group and start recording wherever in the file they happen to find themselves. When the file stream ends, it simply starts from the beginning again, and the client proceeds to capture the part it missed the first time around before disconnecting.
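The looping "carousel" scheme described above can be sketched in a few lines. This is purely illustrative: the numbered chunks and the in-memory stream stand in for real multicast datagrams, and `carousel_download` is an invented name, not part of any actual protocol.

```python
# Sketch of a carousel receiver: the server loops the file forever,
# the client tunes in mid-cycle and records until it has every chunk.
from itertools import cycle, islice

def carousel_download(chunks, join_index):
    """Record from a repeating stream, starting partway through a cycle."""
    received = {}
    stream = cycle(enumerate(chunks))          # server repeats the file forever
    stream = islice(stream, join_index, None)  # client joins late, mid-cycle
    for index, data in stream:
        received[index] = data
        if len(received) == len(chunks):       # nothing left to capture
            break
    return b"".join(received[i] for i in range(len(chunks)))

file_chunks = [b"AA", b"BB", b"CC", b"DD"]
# Joining at chunk 2 still yields the complete, correctly ordered file.
assert carousel_download(file_chunks, join_index=2) == b"AABBCCDD"
```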
The Problem with multicast distribution (Score:1)
What multicast is very good for is replicating installs. If you want to burn one image to every disk in a room full of computers you can easily start the download client on every computer and then start the multicast session.
Overall, multicast could be useful if anyone actually wrote convenient software to serve and receive it. For the geek crowd that downloads distribution .iso files it actually might work, but the general internet public has enough trouble with the current download-on-demand model to be bothered with multicast downloading.
For video and audio broadcasts it's just ideal. With one simple cable/DSL connection *anyone* can become an internet radio or TV station.
What is the state of multicast capabilities throughout the net? Do ISPs use it? Do they let their clients use it?
Re:The Problem with multicast distribution (Score:1)
Multicast is ideal for streams where you can join and leave at any time, and for that reason it's being used mostly for audio and video.
The primary use I see for multicast in the near future might be Usenet news. It's clear that Usenet is going to continue to grow, and it has already reached a crazy size of about 175 gigs per day. Various multicast streams carrying different Usenet hierarchies would allow end sites to take in only the newsgroups they wanted, without straining the backbones.
A more realistic solution for the problem you're currently having with your huge download is probably edge caching. While it's not without problems, it would allow your ISP to avoid taxing the remote FTP server and the bandwidth after one user gets the file.
Multicast and/or caching of any sort of content work well when you have the following:
A large number of users
At a small number of points of access
Accessing a small number of data objects
That are large in size as compared with available resources
--
Dane Jasper
Sonic.net
Big Problem. (Score:2, Insightful)
Multicast video works because only a fixed number of bits is needed at any one time. That is, a 24 Kbps video stream only needs 24 Kbps of pipe to work; having a T1 won't help you get the video any faster. But having a T1 will help you download an ISO image faster so you can start the install sooner.
Also, streaming video doesn't require a perfect stream: if a piece is missing, the player just ignores it and moves on. But an ISO image needs to be perfect. If it isn't, you've just made a nice coaster for your coffee cup.
The only way I see it working is if everyone agreed to download at the speed of the slowest link, and I'm not going to let my DSL line go to waste so I can download at the 33.6k of a dialup user who's willing to wait 4 days. Requiring a perfect copy would also force the server to resend any time a client reported a lost or corrupted packet. One needs only to be familiar with Norton Ghost in a lab with one bad NIC or HDD to see the crawl that results until the bad box times out.
So while nice in theory I doubt it would have much benefit outside of a controlled lab environment where everyone is on the same high-speed connection and there is very little loss of packets.
Re:Big Problem. (Score:1)
Alternatively, you could split the file into many chunks and have a separate broadcast stream for each chunk.
It ought not to be too hard to set up something like this with a bit of creative programming.
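The chunk-per-stream idea above amounts to a simple partitioning problem. Here's a rough sketch; the 239.x multicast group addresses and the `plan_chunks` helper are invented for illustration, not a real service:

```python
# Split a file into N near-equal pieces and assign each piece its own
# (hypothetical) multicast channel, so clients can pull chunks in parallel.
def plan_chunks(file_size, n_streams):
    base, extra = divmod(file_size, n_streams)
    plans, offset = [], 0
    for i in range(n_streams):
        length = base + (1 if i < extra else 0)   # spread the remainder evenly
        plans.append({"group": f"239.1.1.{i}", "offset": offset, "length": length})
        offset += length
    return plans

# A CD-sized ISO over 4 streams: the chunks tile the file exactly.
plans = plan_chunks(650 * 1024 * 1024, 4)
assert sum(p["length"] for p in plans) == 650 * 1024 * 1024
assert plans[0]["offset"] == 0
```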
BitTorrent? (Score:1)
http://bitconjurer.org/BitTorrent/
It's not multicast per se, but seeks to avoid the horrific inefficiencies you've noted.
You could think of it as inspired by mojoNation, but it's a different architecture focusing on a different problem.
Here's how I see it working (Score:2)
I'm no expert on network protocols. I'm not even a software guy. So some of what follows may seem very naive. Bear with me and see if this makes sense. Here's how I see it working.
Data Rate. The server would send several streams at once on several channels, each one paced for a different data rate. For example, the T1 user would pick a different channel than the 28K modem user. Each channel endlessly repeats the same data set, over & over.
Keeping Track. Each datagram sent would contain an offset value that shows where it fits into the big picture. Thus, the client knows which parts of the whole have been received, and which ones have not. As we shall see, this helps deal with start time synchronisation and dropped packet issues.
Start Time. You don't even try to synchronise start times. If a client connects in the middle, so what? It just stores the second half of the data set, then stays on the line for the next repetition to pick up the first half. The offset tags tell the client when it has received the whole data set.
Missed Packets. This is the hard part. If a client misses a packet because it is dropped en route, or for whatever reason, there are a few ways to deal with it.
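The offset-tagging idea from the post above can be sketched as follows. The `(offset, payload)` datagram shape and the `receive` helper are assumptions for illustration; a real client would listen on a socket rather than iterate a list.

```python
# Each datagram carries its position in the file, so the client can join
# at any time, tolerate duplicates and reordering, and know exactly which
# pieces are still missing after a pass through the stream.
def receive(datagrams, total_chunks):
    have = {}
    for offset, payload in datagrams:     # arrive in any order, may repeat
        have.setdefault(offset, payload)  # duplicates are harmless
        if len(have) == total_chunks:     # complete; stop listening
            break
    missing = set(range(total_chunks)) - set(have)
    return have, missing

# Out-of-order arrival with a duplicate; chunk 1 was dropped en route.
packets = [(2, b"CC"), (3, b"DD"), (2, b"CC"), (0, b"AA")]
have, missing = receive(packets, total_chunks=4)
assert missing == {1}   # keep listening for the next repetition to fill this
```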
Reliable Multicast Standards (Score:1)
The cornerstone technology of any reliable multicast system is FEC (Forward Error Correction), an encoding technique that adds redundancy so lost or corrupted packets can be reconstructed without retransmission.
We at Onion Networks [onionnetworks.com] have created a very solid FEC library that will form the foundation of our open source implementations of the reliable multicast protocols. The FEC library can be had at http://onionnetworks.com/components.html
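To illustrate the FEC principle (this is a toy, not Onion Networks' library): with a single XOR parity packet, the receiver can rebuild any one lost packet on its own. Production FEC codes such as Reed-Solomon erasure codes generalise this to repairing many losses; the one-parity case below is just the simplest instance.

```python
# Toy forward error correction: XOR of all data packets gives one parity
# packet; XOR-ing the survivors with the parity regenerates a lost packet.
def xor_packets(packets):
    out = bytearray(len(packets[0]))
    for p in packets:
        for i, b in enumerate(p):
            out[i] ^= b
    return bytes(out)

data = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]
parity = xor_packets(data)            # sent alongside the data packets
# Packet 1 is lost in transit; the receiver repairs it locally,
# with no NACK and no retransmission from the server.
recovered = xor_packets([data[0], data[2], parity])
assert recovered == data[1]
```

This is why FEC matters for multicast: every receiver can fix its own (different) losses from the same redundant stream, instead of each one asking the sender for a resend.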
opencola (Score:1)
Swarmcast (Score:2)
Alternatively, Gnutella or eDonkey [edonkey2000.com] like programs can be used.
Re:Swarmcast (Score:1)
Just FYI Swarmcast is developed by the OpenCola guys mentioned above. So it is in fact the same thing.
The basic idea is to make it possible to share the file between the downloading clients. That is, client A begins a download from server S. This is done in the normal fashion. After a while client B joins in. Server S then begins to transmit to B and also tells B about the existence of A, and vice versa. Now A sends parts of its download to B, and B sends to A. Both still get data from S. This continues in the same fashion as more clients join the "mesh".
The smart part is that the file is first coded using an FEC (forward error correction) algorithm, also used when communicating with satellites. Basically you can think of it as RAID for packets. The packets are coded redundantly, so you don't need all of the coded ones. (There are more coded packets than original ones, so any sufficiently large subset is enough to reconstruct the file.)
The same algorithm (FEC) can be used as-is for multicasting. (And the site contains links to papers describing this.)
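The mesh idea can be sketched as a toy simulation. This idealised model (lossless exchange, instant round-robin seeding, the invented `swarm` helper) is not Swarmcast's actual protocol; it only shows the key payoff, that the server uploads each chunk roughly once rather than once per client.

```python
# Idealised swarm: the server seeds each chunk to exactly one client,
# then clients trade chunks among themselves until everyone is complete.
def swarm(chunks, clients):
    have = {c: set() for c in clients}
    server_sent = 0
    for i in range(len(chunks)):                 # server seeds round-robin
        have[clients[i % len(clients)]].add(i)
        server_sent += 1
    while any(len(h) < len(chunks) for h in have.values()):
        for a in clients:                        # peers exchange chunks
            for b in clients:
                have[a] |= have[b]               # idealised lossless transfer
    return server_sent

# 8 chunks, 4 downloaders: the server uploads 8 chunks total,
# not the 32 a pure unicast server would have to send.
assert swarm(list(range(8)), ["A", "B", "C", "D"]) == 8
```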
Dr. Dobbs had an interesting article on this topic (Score:1)
I remembered this from the dead-tree edition, and luckily it's one of the articles that has full text available online.
Check it out here [ddj.com]...
Intel makes a product for this. (Score:1)
Because you're downloading Mandrake, this might not be useful for you. I'm posting this in case it might help some Windows admins out there. Intel makes a product called LANDesk Management Suite that does multicast software distribution. There are two things that I should explain at this point. 1) In the current version (6.4) of the product, multicast software distribution is an add-on that must be purchased separately. 2) It is misnamed, because it doesn't actually use multicast packets.
The way it works is that the server will send a command to one node on each subnet telling them to fetch the software from a specified location. Then the rest of the nodes will be given a command telling them to fetch the software from the designated computer on their subnet. So it should be called a multi-tiered distribution instead of multicast distribution. It works well and is worth looking into if you have to do this sort of task all the time.
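The two-tier pattern the post describes can be sketched like this. To be clear, this is not Intel's actual implementation; the node list, the `"origin-server"` label, and the leader-picking rule are all invented for illustration.

```python
# Multi-tiered distribution plan: one leader per subnet fetches from the
# origin over the WAN; every other machine fetches from its local leader.
from collections import defaultdict

def plan_distribution(nodes):
    """nodes: list of (hostname, subnet) pairs -> list of (host, source)."""
    by_subnet = defaultdict(list)
    for host, subnet in nodes:
        by_subnet[subnet].append(host)
    plan = []
    for subnet, hosts in by_subnet.items():
        leader, rest = hosts[0], hosts[1:]       # first host acts as leader
        plan.append((leader, "origin-server"))   # one WAN transfer per subnet
        plan.extend((h, leader) for h in rest)   # the rest stay on the LAN
    return plan

nodes = [("h1", "net1"), ("h2", "net1"), ("h3", "net2"), ("h4", "net2")]
plan = plan_distribution(nodes)
# Only two machines ever touch the origin server; the rest pull locally.
assert sum(1 for _, src in plan if src == "origin-server") == 2
```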
New version (Score:1)
Onion Networks (Score:1)
Re:Onion Networks (Score:2)
My question for you is: will it work over the Internet as it is now, or do all the routers between the source and the destinations have to be specially set up to handle multicast traffic? I did a number of multicast experiments a couple of years ago and found multicast to be unusable over the net because the routers dropped all the packets.
--jeff
Re:Onion Networks (Score:1)
Do you get much feedback from users of your library? I'm curious what other projects are going on, especially open source ones that I could follow. How much can you reveal about the content distribution project you are working on? For example, which platforms, and when will it be on the market?
How long do you think it will be until multicast becomes the mainstream delivery method for popular packages over the internet? That would obviously take generally accepted standards, widely adopted packages, and, I expect, a lot of expensive Cisco upgrades! Do you foresee it any time soon?