
Are Digital "Margin Notes" Possible Yet?
Stavo asks: "I'm looking for a robust, reliable personal knowledge management solution. As a professional researcher, I read a lot of text-based content. I prefer to mark up content, by underlining or adding margin notes. I also need to retrieve and search content. The low tech solution is printing the text and using a pen to mark up, then filing the papers. If I want to quote a source, I have to type the quote. With the advent of Tablet PCs and similar tech, I'd like to find a way to keep the content digital. In other words, if I download a journal article in PDF or HTML, how can I mark it up, save it, and later search/retrieve it? Shouldn't computers provide a better solution than voluminous file cabinets filled with dead trees?"
beat me to it ;-) (Score:2)
You're describing something that I have wanted to build ever since my advisor started handing me papers to read left, right and center. Unfortunately (or not, depending on how you look at it), I haven't had enough time to do more than think, "wow, what I really want is a database that can hold these papers, do some kind of semi-intelligent indexing, keep notes, and figure out what the BibTeX entry should be."
From the little bit of research that I've done, a lot of the pieces for this are already out there (i.e. APIs for manipulating PDFs, database engines, indexing engines, etc.) but I just haven't had the time to put any of it together. Anyways, if anyone does have an answer please let me know about it ;-)
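For what it's worth, the "database that can hold these papers, keep notes, and search" part can be sketched with nothing but Python's bundled sqlite3 module. This is only a rough sketch under stated assumptions: the schema, the function names, and the `bibkey` column (standing in for a BibTeX key) are all made up for illustration, text extraction from PDFs is assumed to have happened elsewhere, and plain LIKE matching stands in for real indexing.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create (or open) a tiny paper/notes database. Schema is illustrative only."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS papers (
            id     INTEGER PRIMARY KEY,
            bibkey TEXT UNIQUE,   -- hypothetical stand-in for a BibTeX key
            title  TEXT,
            body   TEXT           -- extracted text; extraction is out of scope here
        );
        CREATE TABLE IF NOT EXISTS notes (
            id       INTEGER PRIMARY KEY,
            paper_id INTEGER REFERENCES papers(id),
            quote    TEXT,        -- the passage being annotated
            note     TEXT         -- the margin note itself
        );
    """)
    return db


def add_paper(db, bibkey, title, body):
    """Store one paper and return its row id."""
    cur = db.execute(
        "INSERT INTO papers (bibkey, title, body) VALUES (?, ?, ?)",
        (bibkey, title, body),
    )
    return cur.lastrowid


def add_note(db, paper_id, quote, note):
    """Attach a margin note to a quoted passage of a stored paper."""
    db.execute(
        "INSERT INTO notes (paper_id, quote, note) VALUES (?, ?, ?)",
        (paper_id, quote, note),
    )


def search(db, term):
    """Return sorted bibkeys of papers whose title, text, or notes mention term."""
    like = f"%{term}%"
    rows = db.execute(
        """
        SELECT DISTINCT p.bibkey
        FROM papers p
        LEFT JOIN notes n ON n.paper_id = p.id
        WHERE p.title LIKE ? OR p.body LIKE ? OR n.quote LIKE ? OR n.note LIKE ?
        ORDER BY p.bibkey
        """,
        (like, like, like, like),
    )
    return [row[0] for row in rows]
```

Searching the notes alongside the paper text is the point here: a note like "compare with Annotea" should bring the paper back even if the paper never uses that word.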
Re:beat me to it ;-) (Score:2)
Is this really that hard or do your needs go far beyond the capabilities of what I just described?
Re:beat me to it ;-) (Score:2)
Use the Note Tool (Score:2)
Re:Use the Note Tool (Score:2, Informative)
Acrobat c1999 (Score:5, Informative)
Unless I'm missing something, the full version of Adobe Acrobat can do all that. Annotations in text, voice, file attachments, etc. and a file indexing service "Adobe Catalog". Any PostScript output can be turned into a PDF, there are even free tools to do this on Linux. But if you're using Macintosh or Windows, you can print directly to PDF format. Acrobat 5 can even render web pages into PDF format, preserving links. IIRC Adobe also has a fully functional time limited demo available.
Now, getting those dead-tree file cabinets into PDF format is another problem altogether. Possibly using overseas data-entry companies?
Yep, head on over to www.adobe.com and research.
Failure of Open Source world (Score:3, Informative)
gv, ggv, and gsview cannot.
Come to think of it, the Open Source world has seriously missed the ball in general when it comes to PDF documents. Open Source PDF viewers suck. In every single Open Source PDF viewer I've used, I've run into documents where the renderer has the orientation wrong -- and not just the orientation, but the "orientation of the bounding box" being different from the "orientation of the drawn data on the bounding box", so that the top and bottom of the drawn data is lopped off, and there's a ton of white space to the left and right.
The only Open Source PDF viewer I've used that can (gasp) search for text is gsview, and it's *really* flaky and doesn't highlight found text. Nothing like trying to read through a page of text to find the one word you're looking for.
I've never used an Open Source PDF viewer that can antialias embedded bitmap images, which makes things look awful and unreadable.
Finally, Acrobat Reader for Linux is completely awful, and leaks memory like a sieve. I have a friend with about a gig of RAM that Acrobat Reader sucked through in about six minutes of dragging and scrolling the document.
Since PDF-viewing is one of the major office activities (along with word processing and email), this is an enormous impediment to the use of Linux (or any UNIX) in a desktop environment.
It's extremely embarrassing to say something nice about Linux, have a friend use it, and then realize how truly much Linux software sucks at handling PDFs. "You mean I have to read through this thing manually instead of searching?" "Why does this print turned sideways? It works fine on Windows!" "Why does this look so bad?"
I predict Linux will not take off on the office desktop until (a) OpenOffice doesn't look and work completely differently from every app out there, and is free of cosmetic bugs *and* handles MS Office documents almost flawlessly, and (b) PDF viewing doesn't suck.
And for home use, (c) until the Linux sound architecture doesn't completely suck. Right now, the only way to obtain software mixing is through a dropout-prone, non-real-time-scheduled sound server with lousy latency. They usually don't share the sound device very nicely, either. Many sound systems can't do hardware mixing. Linux doesn't have a single way to do software mixing fallback, where a user out of hardware channels will automatically do real-time-scheduled software mixing. Pretty lame. Oh, and at least esd has truly awful resampling. Usually, when new users come to Linux, I hear "why is my sound dropping out when it doesn't on Windows", "why is there lag between something happening and a sound playing", "why does my sound sound so bad" (this when resampling is occurring), or "why can't I hear ICQ sounds when xmms is playing?"
xpdf? (Score:2, Informative)
Perhaps I am simply luckier, but I have never had xpdf get the bounding box wrong in either fashion you describe.
Regardless of luck, I search for text in xpdf without trouble* at least several times a week, for months.
Check it out, at freshmeat [freshmeat.net], for example.
* by `without trouble', I don't count xpdf's nearly overbearing ugliness as `trouble'. :-)
Re:xpdf? (Score:2)
Re:xpdf? (Score:2)
OTOH, you are correct about the bounding box -- xpdf *did* correctly orient the bounding box, whereas gv, and ggv failed on the Linux Alpha Centauri manual that I just tested.
I still can't search, though.
Re:Failure of Open Source world (Score:2)
Also, esd makes mixing sound simple, and there already are drivers for mixing two streams of sound all over the drivers.
How is it you've missed this functionality? Beats me. Maybe you were too busy whining about it on Slashdot.
Re:Failure of Open Source world (Score:2)
Just to make absolutely certain you're wrong, I *just* pulled out a copy of xpdf, LaTeXed a document, and searched for a word. It registers no hits.
I'm not pulling this out of my ass. I've used a ton of ps/pdf viewers, wrote my current print filter, and do tons of ps and pdf processing each week. PDF support for Linux is bad. It's quite true. I use Linux (only Linux) as my desktop environment, and those of you that read my posts know that I'm a tremendous Linux fan. Doesn't change the fact that the PDF support sucks.
Also, esd makes mixing sound simple, and there already are drivers for mixing two streams of sound all over the drivers.
I'm not sure what you intended to say here, but you neatly avoided my complaints. (a) esd sound resampling quality and latency sucks, (b) there's no way to play sound and have the system opportunistically use hardware channels until it runs out and then use software fallback. I did, in fact, run out and purchase a SB Live just so that I could get multi-channel sound on Linux. Had I been using Windows, I could have made do with my older sound card and had exactly what I was looking for. This is not something to sneeze at, telling someone that they can "use a new operating system, but they have to buy new hardware to make up for a deficiency in the sound system".
How is it you've missed this functionality? Beats me. Maybe you were too busy whining about it on Slashdot.
I'm thinking the same thing, but about you not reading my question.
Re:Failure of Open Source world (Score:2)
Re:Failure of Open Source world (Score:2)
xpdf does have the ability to search through text, and if you can't find it or make effective use of it, that's really your problem then, isn't it?
PDF support sucks because PDF sucks.
Why do you expect others to write drivers for your existing hardware? And why do you whine when you didn't check to see whether your hardware was supported in the fashion you wanted before taking the Linux plunge?
Sounds like someone convinced you to use Linux, you ran out and bought a copy (or invested a pile of time in it,) without doing some rudimentary research first. Your friend is at fault for being a zealot without considering his actions, and you are at fault for not looking into the matter a little more carefully.
I apologize on behalf of the rest of us for your friend's over-enthusiasm.
Also, there already exist drivers that can mix sound together--but where do you want this to happen? In software? In hardware? Software solutions don't work so well because the streams may not match and may need to be resampled (a costly affair.) If you want support in the drivers you're using for the multi-channel hardware you might have (can SB Live play two completely different sample types at the same time?) then why not donate some hardware to someone who can do it or find some specs and write it yourself?
But check the ALSA project:
http://www.alsa-project.org
I read your question just fine--you apparently have forgotten there's this thing called Google.
Re:Failure of Open Source world (Score:2)
I love this argument -- Foo sucks because Linux has poor support for it. Seen it tons of times.
PDF isn't a closed standard, and the hard work is already done by ghostscript. PDF support sucks because Linux front ends suck compared to the Windows and Mac variants of Acrobat Reader.
Why do you expect others to write drivers for your existing hardware? And why do you whine when you didn't check to see whether your hardware was supported in the fashion you wanted before taking the Linux plunge?
I *have* drivers, you dolt. Read my message. I'm complaining about the lack of *any* support under Linux from falling back from hardware mixing to software mixing when you run out of channels -- your only option is to buy a card with so many channels that you'll never need more.
Sounds like someone convinced you to use Linux, you ran out and bought a copy (or invested a pile of time in it,) without doing some rudimentary research first. Your friend is at fault for being a zealot without considering his actions, and you are at fault for not looking into the matter a little more carefully.
I've been using Linux since the RH 5.x era, and exclusively as my desktop for years. I've used three different sound driver systems. I didn't just grab a copy off the shelf. Up until very recently, there was no free hardware mixing support at *all*, matter of fact.
Also, there already exist drivers that can mix sound together--but where do you want this to happen? In software? In hardware?
In hardware if the channels are available, otherwise fall back to software. Not that complicated.
Software solutions don't work so well because the streams may not match and may need to be resampled (a costly affair.) If you want support in the drivers you're using for the multi-channel hardware you might have (can SB Live play two completely different sample types at the same time?) then why not donate some hardware to someone who can do it or find some specs and write it yourself?
THE HARDWARE MIXING DRIVERS ARE WRITTEN! There is no *software fallback* support. And first of all, "write it yourself" is not feasible for *every* thing you lack (and I have added missing features to software on a number of occasions, thanks). The ALSA people have stated emphatically that they don't want to deal with software mixing, so they refuse to support this, and no one else has single device multi-channel (OSS/Linux calls this "multi-open") support.
http://www.alsa-project.org
It's *hardware mixing*.
I read your question just fine--you apparently have forgotten there's this thing called Google.
No, you misread it twice.
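The policy being argued for in this comment (hardware channels while they last, software mixing for everything after that) is easy to state as a toy model. The sketch below is not how ALSA, OSS, or esd work; the `Mixer` class and its methods are invented for illustration, samples are assumed to be signed 16-bit integers at a shared sample rate, and it ignores both resampling and the real-world wrinkle that the software mix itself has to be fed to a hardware channel.

```python
class Mixer:
    """Toy model: hand out hardware channels first, software-mix the overflow."""

    def __init__(self, hw_channels):
        self.hw_free = hw_channels   # hardware channels still unclaimed
        self.hw_streams = []         # streams the card mixes itself
        self.sw_streams = []         # overflow streams mixed on the CPU

    def open_stream(self, samples):
        """Register a stream; return which path it got ('hardware' or 'software')."""
        if self.hw_free > 0:
            self.hw_free -= 1
            self.hw_streams.append(samples)
            return "hardware"
        self.sw_streams.append(samples)
        return "software"

    def software_mix(self):
        """Sum the overflow streams sample by sample, clamped to signed 16-bit."""
        if not self.sw_streams:
            return []
        length = max(len(s) for s in self.sw_streams)
        mixed = []
        for i in range(length):
            # Streams shorter than the longest one simply contribute nothing.
            total = sum(s[i] for s in self.sw_streams if i < len(s))
            mixed.append(max(-32768, min(32767, total)))
        return mixed
```

With a two-channel card, the first two calls to `open_stream` get "hardware" and every later one gets "software". The clamping step is where the quality argument in this thread lives: naive summing clips, and mismatched sample rates would need resampling before this loop could run at all.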
Re:Failure of Open Source world (Score:2)
Okay, let me be a little more patient with you, because you're going to a tremendous effort to write such long (and dorky) notes back:
1. I never said PDF sucks because Linux has poor support for it. I said exactly the other way around: Linux has poor support for PDF because PDF sucks. BIG difference, and it's too bad you missed it.
2. There are front-ends that don't "suck" other than xpdf. If you want the pretty graphics, and the cutesy hotkeys, and so on, then don't use xpdf. Personally I like the tiny footprint of xpdf. You obviously want something that looks like a Mac. Yay you.
3. Did you not just say that you had to buy a new sound card because your previous one wasn't performing up to your standards in Linux? And who's the fucking dolt? You're the one bitching about Linux drivers like some kind of petulant child. Are you aware of the kind of CPU wastage that happens when disparate samples are mixed in software? Perhaps the reason there is no software fallback is that there's no clean way to ensure that the impact of such mixing doesn't kill the system? Hm? Software mixing doesn't belong in the driver, and you're the dolt for thinking it does.
4. YOU were fucking whining about the lack of support for your older card: "I did, in fact, run out and purchase a SB Live just so that I could get multi-channel sound on Linux. Had I been using Windows, I could have made do with my older sound card and had exactly what I was looking for." So your older sound card didn't have enough channels for you? Jesus you're picky for someone using a free OS. And maybe there's a good reason why ALSA developers don't want to waste their time on software mixing?
5. The original post I was replying to was PDF support. I notice you've conveniently cut this out of your own notes. Does that mean I was right about PDF support under Linux? That xpdf does search through text after all? Hm? Perhaps YOU are the one misleading yourself, here: After all, xpdf *DOES* have text-searching capabilities, and you don't appear to be capable of clicking the little binoculars button at the bottom of the screen...
6. I love how you characterize the free software that you're enjoying that supports your multi-channel hardware as: "completely suck[s]". Oh, did you forget what you said? Here's a refresher: (Linux will not take off) "until the Linux sound architecture doesn't completely suck." What gratitude. What gratefulness. What willingness to volunteer to help.
What fucking bullshit.
You're just making an ass of yourself--give it up now before it's too late for redemption.
EndNote may help (Score:1)
in MS Word (Score:3, Funny)
(oh crap! this is slashdot...wait a minute, don't use MS Word!)
Re:in MS Word (Score:3, Interesting)
Oh god no! When I had a job (Don't worry, I didn't get laid off. I quit just before the implosion to go to grad school.) I had to write a design doc in Word. Track changes absolutely sucked. It couldn't merge two documents from a common ancestor at all. It said it did, but it couldn't. The only way you could get it to was to merge them one at a time.
It was an experience I wouldn't want to repeat.
Re:in MS Word (and MS OneNote?) (Score:1)
I have used this extensively when reading other people's documents and sending back suggestions. The only limitation I've seen vs. real margin notes is that you can't control the size of the font for the margin note (unless there is a way I'm not aware of).
Here's a (not so great looking) screenshot [iupui.edu] of the comment feature in Word.
Also, look into the new beta OneNote [microsoft.com] from Microsoft. I have not yet seen it; I just found it on the MS website while looking for a Word comment screenshot. It looks like it's geared toward the TabletPC, but I can't tell from my brief reading if it can annotate existing documents or if it's just a glorified notepad with its own file format.
Anyone do this with XML? (Score:1)
I've been a tech writer for years and entities from Sun to the local universities and utility companies all fail to implement systems of this sort for various reasons...often technical, more often political and financial.
I do believe it's possible, however...
Re:Anyone do this with XML? (Score:2)
BAH. I had written up a nice example of the format of such a document, using s-expressions ala Lisp. Which could very easily be translated to XML, one-for-one. However, Slashdot's silly lameness filter didn't like all the parens I used?
Annotea is the start of a solution (Score:4, Interesting)
Annotea [w3.org] is a W3C project. To quote from the site:
It provides annotation capabilities for HTML documents, and maybe XML documents, delivered in a web browser or similar UA.
Annozilla [mozdev.org] is a project for providing Annotea capabilities for Mozilla. Check it out!
HTH
/mike
Amaya (Score:2)
Amaya [w3.org] has annotations built in.
While I'm not about to start using Amaya to surf the web and post to /. - it's fun to play with things like the annotations. They can be stored on a remote server and shared apparently, but I've not tried that yet. All in all - I think it looks just like what the poster was asking for.
Annotated parent post (Score:2)
Re:Annotated parent post (Score:2)
Re:Annotated parent post (Score:1)
hmm, I may have screwed that up. The annotated url is http://ask.slashdot.org/article.pl?sid=03/01/05/0458208
but the annotation is apparently attached to 'an unknown' portion of the document...
Annotea apparently still has some rough edges, and I can't get Amaya to upload the annotations to the annotation server... *sigh*
Re:Annotated parent post (Score:2)
Hmm, just uploaded another via Amaya (got that to work), but that also attached to an unknown section of the document. Strange.
The Amaya one shows up in Amaya in the proper location. Both show up in the Annozilla toolbar, but as attached to an unknown section, and the one created in Annozilla doesn't show up in Amaya.
/me shakes his head and decides to leave it.
Margin Notes (Score:3, Funny)
Short answer: No
Longer answer: Nope
Re:Margin Notes (Score:1)
I'd like to get that one for metamoderation.
Re:Margin Notes (Score:2)
Yeah, but most of the suggested solutions are half-ass and/or unwieldy. Yeah, everyone knows you can insert little text boxes all over a PDF doc. It's still a pain and not very flexible. I stand by my comment :)
Re:Margin Notes (Score:1)
Use Summation (Score:2, Informative)
Check 'em out here at http://www.summation.com
-- anthony
btw, the legalese term for "margin notes" (Score:1)
how about... (Score:2)
Adobe Acrobat (Score:3, Informative)
Short answer -- it works pretty damn well. But not with a mouse. A mouse just isn't suited to making marginal notes (i.e., checking an important idea, underlining a particular phrase, or circling an important passage). A tablet device with a stylus, however - that holds promise.
Other things to note: Acrobat provides two types of commenting systems. First, notations -- you can highlight, underline, circle, or freestyle directly onto the document. Second, "sticky-note" style comments. One very cool thing about the sticky-notes is that they render translucent so that you can still read the text underneath the note.
Also, as far as I can tell, the commenting systems appear to be embedded into the document as PDF code. Specifically, gv is able to render notations (highlighting, underlines, etc). gv is not able to render the sticky-notes, however. I don't know if that's because gv simply can't handle the sticky-notes or because the sticky-notes are in some type of proprietary format. xpdf doesn't render either form of comments.
So, if you're using Windows, are comfortable with proprietary software, and can afford $250, you're more or less set (assuming that pen computing lives up to its promise).
Things get a bit more tricky if you're looking for free-software solutions. As far as I know, there's nothing out there as of yet. And I don't know how difficult it would be to implement (I do know that it's way beyond my capabilities, however). But because it appears that Acrobat embeds the comments as native PDF code, it should be possible. The question is whether or not anyone's willing to take up the cause...
All you have to do (Score:1)
web page annotation (Score:2)
Other people with the same tool could then view the annotations.
does this still exist?
Re:web page annotation (Score:2)
ThirdVoice [c2.com], now defunct.
crit.org: public annotations on the Web (Score:1)
There's a short paper explaining this system at http://zesty.ca/crit/yee-crit-cscw2002-demo.pdf [zesty.ca].
Historical: CMU's Andrew User Interface System (Score:2, Informative)
Come to think of it, that's pretty much the experience with AUIS/ATK in general. In significantly less-nice terms, a friend of mine once said:
like all CMU code: way cool design, implementation like wet camel shit.
At this point, I believe that AUIS is pretty much defunct, so I doubt anyone cares. The code is probably available under OSS license if anyone cares (I believe it was old-style BSD (with attribution)).
This is exactly why HTML sucks (Score:2)
Using Acrobat is not an option, real markup of things on the web needs to be the goal.
You're not alone... I hope that's comforting.
--Mike--
Re:This is exactly why HTML sucks (Score:1)
My impression is that the group discussion/design was complicated quite a lot by multiple divergent interests (i.e. live annotations or not, shared or not, modified copy versus reference the original, etc.). I think this may be an area where a small group of people can achieve better success by presenting a mostly-finished design to peer review (rather than consensual group design).
For my part, I think that such a system would be cool, but my practical needs would be met by a wiki that let you easily move forward/backwards among saved revisions, with some (optional?) way to view metadata about a change (who/when/what/why). CVS provides most of the tools (abstractly; I don't know that I would want to use CVS code).
Good luck! I look forward to hearing about your project in the future...
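The wiki-style approach described in this comment (saved revisions you can step through, each carrying who/when/what/why) boils down to a small data structure. The sketch below is one hypothetical way to lay it out in Python; `Revision`, `Page`, and every method name are made up for illustration, and the "what" is computed on demand with the standard difflib module rather than stored.

```python
import difflib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Revision:
    text: str   # full page text at this revision
    who: str    # author of the change
    why: str    # free-form reason for the change
    when: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class Page:
    """A page as an append-only list of revisions plus a movable cursor."""

    def __init__(self):
        self.revisions = []
        self.cursor = -1

    def save(self, text, who, why):
        """Append a new revision and move the cursor to it."""
        self.revisions.append(Revision(text, who, why))
        self.cursor = len(self.revisions) - 1

    def _require_history(self):
        if not self.revisions:
            raise LookupError("no revisions saved yet")

    def back(self):
        """Step one revision back (stopping at the first) and return it."""
        self._require_history()
        self.cursor = max(0, self.cursor - 1)
        return self.revisions[self.cursor]

    def forward(self):
        """Step one revision forward (stopping at the newest) and return it."""
        self._require_history()
        self.cursor = min(len(self.revisions) - 1, self.cursor + 1)
        return self.revisions[self.cursor]

    def what_changed(self):
        """Unified diff between the previous revision and the one at the cursor."""
        self._require_history()
        new = self.revisions[self.cursor].text.splitlines()
        old = self.revisions[self.cursor - 1].text.splitlines() if self.cursor > 0 else []
        return list(difflib.unified_diff(old, new, lineterm=""))
```

Storing full text per revision is wasteful compared to what CVS does with deltas, but it keeps stepping backwards and forwards trivial, which is the feature being asked for.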
Possible technique? (Score:2)
Then, all you need to do is search for the "markers" and match the md5 hash with the comment. If anyone fancies implementing this, give me a shout. I think it would be quite easy to do.
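The first half of this post isn't preserved here, so the following is only a guess at the shape of the scheme: drop a marker containing an md5 hash into the text at the annotated spot, keep the comments in a separate store keyed by that hash, then scan for markers to line them up again. The `[[note:...]]` marker syntax, the function names, and the choice to hash the comment text are all assumptions made for the sake of a runnable sketch.

```python
import hashlib
import re

# Hypothetical marker syntax: [[note:<32 hex digits>]]
MARKER = re.compile(r"\[\[note:([0-9a-f]{32})\]\]")


def annotate(text, passage, comment, store):
    """Insert a marker right after the first occurrence of passage.

    The comment is saved in store under the md5 of its own text, and the
    marked-up copy of the document is returned.
    """
    idx = text.find(passage)
    if idx < 0:
        raise ValueError("passage not found in document")
    digest = hashlib.md5(comment.encode("utf-8")).hexdigest()
    store[digest] = comment
    end = idx + len(passage)
    return text[:end] + f"[[note:{digest}]]" + text[end:]


def notes_in(text, store):
    """Return (offset, comment) for every marker whose hash is in store."""
    return [
        (m.start(), store[m.group(1)])
        for m in MARKER.finditer(text)
        if m.group(1) in store
    ]


def strip_markers(text):
    """Recover the unannotated document by removing all markers."""
    return MARKER.sub("", text)
```

Keeping the comments out of the document means the marked-up copy can be shared without the notes, and stripping the markers gives back the original text byte for byte.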
Partial solution (Score:3, Informative)
Anyway, while it also lets you manage Word, Excel, PDF etc. files and web pages (and view them within its interface), unfortunately it won't let you annotate those. That would indeed be a very nice extra feature, maybe it should be suggested to ScanSoft. But still, scans of printed articles do make up a very substantial subset of research articles (my wife also does research and has to deal with this same issue), so PaperPort's features are still very useful. Plus it's a very inexpensive product, often included for free with $40 scanners.
Plenty of tools available now... (Score:2)
*sigh*
As some have mentioned, you can do this with Adobe Acrobat. You can get it on Unix. And no, it is unfortunately non-Free. Call me crazy, but in my moral scheme, I place a much higher importance on reducing the amount I waste than on avoiding the occasional hunk of proprietary (although free) software. Killing something that was alive comes before the GPL. I know, I must be nuts.
That said, I luckily do not have to use proprietary software for doing annotation. I have a little tool written in Squeak Smalltalk for annotating documents. Namely, I can annotate HTML, PostScript and PDF right now. You can add text (less storage space) or a drawing. There's even a handy little button where you can enable and disable the annotation marks.
In PS and PDF, I cannot resave as a PS or PDF with the new layers, but I can save in a format I can later open up and read. I can also do a fresh export to GIF or PostScript (and could then use ps2pdf if I wanted to share as PDF).
The app in question would run on any platform (Squeak is actually cross-platform; don't equate this with Java), except that the current version makes some calls out to libraries in OS X, namely the AppKit. This isn't absolutely necessary; with more work, it could be written to work with both the AppKit and GhostScript. Someone is making progress on a pure Squeak PDF renderer, so if that becomes even more usable soon, I could ditch the usage of Mac OS X's class library and just use that...
It can also annotate a "stack" of images (PNG, JPG, GIF), but you don't often come by documents in such way. However, it was super easy to add, so I did- and there are some docs I've come across in this format, e.g., a bunch of books where each page is a
And yes, this tool is completely open source and Free. I don't have it online for download, but it was such an easy thing to write, I assumed it was not something hard to come by. If people are interested, I could prepare it for such distribution...
For the PDA...
Also, the Newton can do it. Every eBook reader on the Newton I've used (PaperBack and Newt's Cape) can do annotation. Just tap the annotation button, and it interprets what you write as a drawing to annotate. To my knowledge, neither lets you do pure text annotation, which would be nice I guess, but it's better than nothing!
Amaya and annotations (Score:2)
OK, first of all - I'm not all that familiar with this - but here goes anyway. There is a W3 project called Annotea [w3.org]. It is implemented in Amaya as annotations, which apparently can be stored on a remote server. It uses this RDF annotation schema [w3.org], and the annotations are stored on a remote annotation server (the annotation server howto [w3.org]).
When you have created an annotation for a piece of text, there is a pencil icon next to it. Click it and the annotation appears as a popup. It appears to be a very nice concept - but I've not used it much. I assume that the annotations could be presented inline in the document.