Is the New Microsoft Office Really Open?
joesklein asks: "From CNET, there is an article about the new Microsoft Office 11. In summary 'Microsoft says it's opening its Office desktop software by adding support for XML--a move that should help companies free up access to shared information. But there's a catch: It has yet to disclose the underlying XML dialect.' Could this be grounds for another anti-trust suit against Microsoft?"
Re:That's still to be seen... (Score:2, Insightful)
eXtensible Markup Language...
Just my $.02
Defaults (Score:5, Insightful)
If the XML files Office produces are not made the default save type, or if the XML merely encapsulates large portions of binary code, it will not matter one lick that Office can save these XML documents, because the majority of people will be stuck on the default, unreadable formats.
Can you copyright/patent a schema ? (Score:5, Insightful)
Of course it's not openly documented (Score:2, Insightful)
XML != Open. XML is more open than binary, because it's more readable and easier to reverse engineer.
But XML can reference COM objects. XML can have binary areas. XML is just a metaformat.
So exactly what do you call XML....? (Score:4, Insightful)
I think we've all had more than enough history to justify being suspect.
Fool me once, shame one. Fool me twice, and you know I'm a MS user.
I doubt you'll get an answer... (Score:5, Insightful)
Being that it's NEW, people haven't really had enough time to learn enough about it (as in actually using it) to give an informed answer.
Perhaps you should re-post your question in 2 months when you can get some informed responses.
This illustrates the shortcoming of XML (Score:4, Insightful)
NO! (Score:2, Insightful)
No it is not...
The Bush administration made it clear on the first day they wanted this to go away. As long as Billy isn't taking your 401K, I'm sure no one is going to bother him for a while.
How many millions were spent on this farce? And for what? A verbal reprimand from the judge? Think about it: all that money could have gone into tanks and bombs to bomb other countries and free us all from "terror".
what does it matter (Score:5, Insightful)
Big deal if they don't open it up anyway (I don't really expect them to), StarOffice/OpenOffice will crack it to a certain extent anyway. For most people's file conversions, it's not that much of a difference to convert documents. Doesn't always look pretty, but it works fairly well.
Wake me up when something Microsoft does is surprising...
Re:Defaults (Score:3, Insightful)
Microsoft XML != XML (Score:4, Insightful)
Remember, you can also save a Word document as an HTML file; however, the HTML is so disgusting, so non-standard, that the only things that could possibly read it are more Microsoft products. The same, I would presume, will be happening to their XML feature.
Additionally, it's not too far-fetched that Microsoft would make their own DTD (Document Type Definition).
Re:Can you copyright/patent a schema ? (Score:2, Insightful)
As far as content is concerned, anybody could write their own XML parser. What MS knows is going to sell more copies of Word et al. is its strong support for embedding ActiveX objects. So, the next time you want to embed a Rational Rose UML diagram in your Word document, you'll most likely find that other software packages aren't going to interpret how this is stored in XML as well as the MS Office suite could.
Re:LOL (Score:1, Insightful)
<DATA>
asdfafs%65356FG653$5#@$%6Asdtkasdt
.
.
.
.
</DATA>
</XML>
Re:Reverse Engineer (Score:5, Insightful)
The purpose of XML is to have buzzword compliance, and this doesn't defeat that.
(Of course that's not the purpose most other people use XML for, but we're talking about Microsoft.)
Points to remember... (Score:5, Insightful)
1) XML, SOAP and all these new technologies were pioneered by Microsoft
2) They killed all the standards they didn't pioneer (CORBA anyone ?).
3) There is NOTHING in the XML spec that _requires_ people to open up their schema definitions. It's purely a structure definition in the same way as Microsoft's old Word documents were stored; it's just that now the markers are in text format and any standard XML parser will be able to read the file.
4) Open Office can already read word documents even though they aren't in XML.
5) So can Word Perfect.
6) Using XML doesn't stop you embedding binary into the document. People often do this to store data (images, for instance), so an OLE reference might still be binary.
7) Pure XML and XSLT are great ways to use up all the power on your processor. Binary has previously been used here because XML is inefficient; if MS had opened the format up, everyone would just complain that it's too inefficient and that it's quicker to save using an older format. So MS are either trying to burn cycles or are customising the XML or their application for speed. Is that wrong? Would it be wrong if KDE did it?
8) People won't switch to or from Word because of XML. Open Office and other tools will be able to read the Word files because other tools (Google for instance) need the format and MS can see a real business need to allow them to see it.
9) XML is a meta-language; as such, anything can be written in it. Hell, they could have a bitch of an external format and then a simple parser that makes it useful, but not tell anyone about the simple parser, so everyone else's documents take years to load.
10) XML is the buzzword of today, OLE to be replaced by SOAP as the buzzword for Office next ?
Get off the high horse guys, whether it's binary or XML is irrelevant; making something XML doesn't make it open. That's like saying that everything you do makes sense, but just because people don't understand the Mayan Calendar and Ancient Greek they complain.
MS will always use Mayan and Ancient Greek, and we _can_ understand them; it's just easier for them as it's their native language and calendar.
Re:Defaults (Score:3, Insightful)
Obviously you haven't tried it. RTF has gotten more complaints from users than raw Word docs have!
Replace "RTF" with "HTML" and you've got a winner, though.
The problem is that Microsoft chooses to retain their obfuscated binary format as the default save type for documents.
It's not "obfuscated" so much as it's "optimized." The whole idea seems to be for Word to save as quickly as possible, which the doc file is best at for Word for some reason, probably because it's derived from how the program structures documents, and not how some document spec says documents should be handled.
If the XML files Office produces are not made the default save type, or if the XML merely encapsulates large portions of binary code, it will not matter one lick that Office can save these XML documents, because the majority of people will be stuck on the default, unreadable formats.
1: It's HIGHLY unlikely that MS's XML implementation will be unnecessary binary code. They have a doc-to-HTML converter already, and the XML converter will probably just be an update of that.
2: You CAN change the default Office save format to RTF, HTML, old_doc_version, or just about any random 'save as' converter you have! (The only major feature I saw missing was the MHTML format.)
Adoption of standard no guarantee of interop... (Score:5, Insightful)
Microsoft (and Netscape) essentially tried the same thing with HTML. Sure, we're using HTML, but to actually view our HTML, you have to use our browser.
Adoption of a "standard" is no guarantee of interoperability. Understanding the conceptual underpinnings of the standard is just as important. The question is, when Microsoft says they are using XML as a document format, are they doing it because they believe in the principles underlying it, or solely for the cynical "this is what is selling now" aspect?
The body of HTML out there is an unparseable babble of a mess, largely because the two dominant browser makers did not respect many of the underlying notions of markup and hypertext to begin with. The state of the art progressed, but not in the way a lot of people wanted it to go.
This could bode poorly if the meme survives somehow that the Office format is now equivalent to XML. When it "doesn't work," who knows where the blame will fall?
Re:NO! (Score:3, Insightful)
After years of work, hundreds of thousands of lawyer man-hours, what do we have to show for it? "Expose your API's unless they are to do with security, and don't be bad again". Honestly, this should have been a bitch slapping of biblical proportions. Not only should the company have been broken up, but a tier 1 deity should have rained down the wrath of the ancients in order to make it happen.
Another anti-trust suit? I don't think anyone's going to be going down *that* road in a hurry.
Dave
XML can be as cryptic as binary (Score:5, Insightful)
The Office file formats will be open if M$ decides to:
Re:Defaults (Score:5, Insightful)
In an era of 2+ GHz computers with 7200+ rpm hard drives, it seems odd that Microsoft would be unable to write an application that can quickly save and open text files that, on average, run well under 50 kilobytes.
Boo Hoo Hoo (Score:2, Insightful)
Comments like this give me the creepies. As a software developer, the last thing I want is some entity telling me what my default format should be.
It's also indicative of the elitist attitudes of many Linuxites. In effect, the poster is saying that users will never have the capability to inform themselves and make a choice as to how they want to use their computers.
Re:Points to remember... (Score:4, Insightful)
XML came out of "SGML for the Web" team sponsored by the W3C. I think this was back in 97/98.
Enjoy,
Re:That's still to be seen... (Score:4, Insightful)
The same point that most technical decisions are based on. Buzzword compliance.
Are you paying attention? It's Microsoft. (Score:4, Insightful)
Dancing MonkeyBoy doesn't hop across a stage for his health. He "loves this company" because it makes money as only a monopoly can.
Silly rabbit. Open is for kids.
FUD alert (Score:2, Insightful)
Nope. Microsoft can set the price of Office because the applications fulfill the needs of its customers. The fact that the file format is proprietary has little if anything to do with it.
The last time I saw StarOffice running on Windows, I damn nearly puked. It's written in something that looks like Java/AWT, the apps take bloody ages to load, opening a document takes even more bloody ages, the UI looks childish and the printing sucks. And I didn't really spend much time with it.
OTOH, the Office apps load damn near instantaneously on even a PII 450, opening even ~50MB documents with hundreds of embedded images never takes more than a few seconds, the GUI is consistent and tight, and the things just work.
Sun (and everyone else) has a problem if it thinks that it can compete with Office on Windows with that stuff, and unless they provide an alternative to VBA, they'll never even make a dent. There are hundreds of thousands of people who write full-fledged business applications using VBA and aggregating Office functionality, and that's not something that a company will just throw away because the file formats are now compatible. w00t.
If anything, opening the formats up will increase the popularity of office suites in Linux, because people won't have to dual boot or whatever to a) be productive; and b) read the stuff that the rest of the world produces.
Re:That's still to be seen... (Score:3, Insightful)
First, you don't have to reference a DTD to produce valid XML. SAX/DOM parsers will work just fine on a document without a DTD.
Second, you can have "binary" data in an XML document. Just base64 encode it.
Third: what's the point of going to XML if you're just going to produce a mess? Simple. You get to claim openness. Most PHBs probably don't know the difference between truly structured, stable, "open" XML, and syntactically-correct but semantically-useless XML.
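To make the first two points concrete, here is a minimal Python sketch. The element names and the five-byte payload are invented for illustration; nothing here is known about the actual Office 11 format. It only shows that arbitrary bytes survive a round trip through a DTD-less XML document once they are base64 encoded.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical payload standing in for an embedded binary object.
payload = bytes([0x00, 0xFF, 0x10, 0x80, 0x7F])

# Build a small document with no DTD; the tag names are made up for this sketch.
root = ET.Element("document")
ET.SubElement(root, "text").text = "Hello"
blob = ET.SubElement(root, "data", encoding="base64")
blob.text = base64.b64encode(payload).decode("ascii")

xml_text = ET.tostring(root, encoding="unicode")

# A standard parser reads it back without any DTD reference...
parsed = ET.fromstring(xml_text)
# ...and the binary payload is recovered intact, yet it is opaque to a text search.
recovered = base64.b64decode(parsed.find("data").text)
```

The parser is perfectly happy with this file, which is the point: it tells you nothing about what the bytes inside the data element mean.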
Re:Defaults (Score:5, Insightful)
But if you really think they have no right doing these things, go live in a 3rd world country; they generally have the government telling you less about what to do. Except once in a while when they kill your family. You could be armed of course. You know what a totally armed society with a weak government looks like? Afghanistan.
That being said, it's hard to see what business the government has engineering document formats. They could, on the other hand, specify disclosure of formats as a remedy in an anti-trust case, but they generally fall into one of two categories which precludes this: stupid or bought.
Re:NO! (Score:2, Insightful)
OK, so is this a good thing or a bad thing?
Re:That's still to be seen... (Score:5, Insightful)
The registry in Windows NT/2000/XP is sort of like that. It makes a lot more sense from a Microsoft-centric viewpoint than it does from a non-Microsoft-centric viewpoint. Now that it's been around so long, there are lots of ways to get at registry data (for instance, using Perl modules), but when the registry was new the only way to do it was through the Microsoft API, and until many people went through the pain of encapsulating the MS API, the pain of accessing the registry from a non-MS-centric toolset was high.
So maybe the XML format will be like that. If you're Linux-centric, for instance, the threshold of pain for accessing Word XML docs will be fairly high, but if you're Microsoft-centric, with all of their tools, code-snippets, documents, etc., then it won't be nearly as painful.
This way MS gets to claim interoperability, make Word data easily accessible to MS-centric solutions, but put a damper on non-MS-centric solutions.
Re:That's still to be seen... (Score:5, Insightful)
I think an analogy to Frontpage is appropriate here. Sure, it produces HTML, but the result just doesn't look right unless it's viewed in IE. Maybe the dtd is referenced, but encrypted or otherwise proprietary. Maybe MSXMLVIEWER (whatever it may be called) doesn't need the reference to be in plain text.
There are any number of things MS could do to ensure that the document just doesn't look right in other viewers. Since formatting is the whole point of XML, people will use MSXMLVIEWER and whatever it reads will be the de facto XML standard, just like whatever IE renders is the de facto HTML standard.
or it just ain't xml at all.
While technically correct, the point is sadly irrelevant. As long as MS is effectively a monopoly XML will be whatever they say it is, for the majority of people.
Also you aren't allowed to put binary data in an xml document
Not true. It's recommended that you don't put binary in an XML document, but nothing prevents you from doing so. This is exactly what will give MS the ability to hijack the standard.
In conclusion they would have to break xml pretty hard-core in order to make their doc types proprietary.
Only in spirit, I'm afraid, but that will likely be enough.
Besides, then what would be the point of going xml in the first place?
To make documents searchable. This is an ability which is extremely valuable to anyone who has a large amount of information they need to access. The upshot is that the actual content will likely be plain text, though important markups may not be. Sadly, format is more important than content for a lot of people.
Of course, most people won't use the XML format at all, since it won't be the default.
Re:"Could this be grounds for another lawsuit?" WT (Score:3, Insightful)
- A.P.
Could new .XML doc format be LESS open than .DOC? (Score:2, Insightful)
It seems from the context of the quotes in the article, Microsoft is very much concerned about how interoperable Word documents are now that they have been reverse-engineered and implemented from scratch in OpenOffice / StarOffice, WordPerfect, etc.
Here's my theory:
Besides value-added features, such as the internet calendar and workgroup features that have been dropped, the best way to achieve this differentiation would be to engineer an incompatible default format (an obfuscated XML DTD or binary encoding format) for new Word documents, leverage their massive installed base of desktop users, and fire up the good-ole FUD-o-matic 9000...
Boom! Office 11 Ships, creating new, incompatible format with new, incompatible documents floating around the LAN, marginalizing the use of Open Source / "fringe" Office software.
MS FUD: "But Open Source / Free Software Word Processors just don't work properly with the cutting-edge features of Office 11!". "They don't have the new whiz-bang features like 'Enhanced' XML, which Office depends on."
No, Mr. Hacker, you can't use Open Office. The company policy is for everyone to use Microsoft Word, because we want everyone to be able to read everyone's documents. By the time the OSS hackers completely reverse engineer the file format, the damage will have been done. And the few glitches in compatibility in engineering compatibility into OSS Office Software will be more fuel for the FUD fire, emphasising how buggy open source software is, and Microsoft is the best choice for 100% correct display and authoring of Word Documents for your MS Office-Run Business.
And until Office 11 ships and they're ready to roll with this new spin, they can take advantage of the hype regarding XML and how wonderful their new file-format will be, see, this Open Office package isn't so special! We can do you one better! XML is designed to be Open, see?
Then, in reality, the new document format will be more closed to us, because we don't know how to read it. Trust me, they won't make it easy. They gain too much by closing up the new format and throwing away the key, profiting from the time it takes to pick and chisel away at the locks.
Closed file formats are worse than closed apps (Score:2, Insightful)
I think this is a more compelling "pitch" for open source than the usual line of "if you can't get the source you can't fix the bugs".
Re:Defaults (Score:2, Insightful)
RTF has been in office for years and it is an open, portable standard readable on many platforms and with many programs. The problem is that Microsoft chooses to retain their obfuscated binary format as the default save type for documents.
Even though RTF is an open standard, many programs which claim compatibility are still not 100% compatible, and can screw up things like embedded images. I suppose Microsoft's implementation of XML will be similar. It will be open, but the more complicated documents would still be displayed differently by non-Microsoft products. It would also force everyone to switch to Microsoft XML, or at least be compatible with it, retaining the dominance of Office.
Re:Defaults (Score:3, Insightful)
In an era of practicality, most offices are still running on 500 MHz boxes with 128MB of RAM and 5400rpm HDs.
Re:Are you paying attention? It's Microsoft. (Score:5, Insightful)
On any Unix or Unix clone you can just run standard tools or write your own.
Unfortunately, with everything in a proprietary format you then end up having to build scripting languages into everything, making all of your data files potential entry points for malicious code.
The move to XML has the potential to eliminate that sort of brain damage once and for all provided they actually open their file formats.
I hope they do it.. but given their past I'm not holding my breath given that the options are long term financial security for MS or Security for their customers and the risk of losing market share in the future.
Re:That's still to be seen... (Score:5, Insightful)
As some others have pointed out:
1) You don't need a DTD or Schema to have XML
2) The url used in a namespace declaration doesn't need to correspond to a real document
3) Even if the document used a DTD or Schema, that DTD or Schema were available, and the document actually validated against it, you still don't know what the hell the tags mean. The DTD or Schema are just syntactical (and grammatical?) rules, and don't tell you how to interpret the tags or attributes.
4) You can always include binary data in an XML document(ie., base64 encoded)
5) The point of using XML is Buzzword compliance and *perceived* openness
There are more reasons why XML does not necessarily = openness. But these ones are more than enough.
XML means nothing; it's just a way to define languages. It's like a charset: just because I have a document that is ASCII doesn't mean that I understand what is written in it if I don't know the meaning of the words (e.g., just because you know the name of each letter doesn't mean that you know the meaning of "lkasdertunxsjd", right?)
Even if a language is in XML, you still need to *document it* to be able to *understand* it.
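A quick Python sketch of points 1 to 3, under loud assumptions: the namespace URL, the tag name and the attribute below are all made up for the example. The standard-library parser never fetches the namespace URL and needs no DTD, and what it hands back is structure, not meaning.

```python
import xml.etree.ElementTree as ET

# Invented document: no DTD, and a namespace URL that points at nothing real.
doc = """<w:doc xmlns:w="http://example.invalid/not-a-real-schema">
  <w:q7 zz="3">lkasdertunxsjd</w:q7>
</w:doc>"""

# Parsing succeeds; the namespace URL is treated as an opaque string.
root = ET.fromstring(doc)
child = root[0]

tag = child.tag          # namespace-qualified name of the element
attrs = child.attrib     # {'zz': '3'}
text = child.text        # 'lkasdertunxsjd'
```

You now know there is an element q7 with an attribute zz set to 3. Whether that is a font size, a revision counter or a checksum is exactly what the undocumented part of the format would have to tell you.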
Sorry if I was a bit rough, but I'm sick of people that assume that because something is in XML it's automatically open. That is one of the biggest myths the XML buzz-wagon is based on, and it is spread by people that don't really understand what XML is.
Please, before you post to
Best wishes
\\Uriel
What I Expect (Score:2, Insightful)
What I am hoping/expecting for in this new format is something like XSL:FO plus binary sections for ActiveX controls, etc.
For the 5 or so posters saying this will be something like:
I highly doubt it. They are on record in several places as saying they want these new files to be indexable and parsable with standard tools, and base64 encoded blocks, I am sorry to say, are not indexable. But of course embeddable objects will probably be forced to manifest this way.
Regarding the claims that this will be like their horrid HTML implementation, I think it is clear you've not done much work with XML. Either a document is valid or it is not. If it's not valid, most parsers will simply reject the file (unlike HTML, which just deals with the problems). If a document is valid, there should be no tool that doesn't properly load and parse it into the DOM, unless it is somehow broken!
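The strictness claim is easy to see with a small Python sketch. One caveat: the standard-library parser used here checks well-formedness only, not validity against a DTD or schema, so this illustrates the "parsers reject broken input" half of the argument. The helper name is_well_formed and the two sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<p><b>bold</b> text</p>"
# Overlapping tags: the sort of thing lenient HTML browsers quietly repair.
bad = "<p><b>bold</p></b>"

good_ok = is_well_formed(good)
bad_ok = is_well_formed(bad)
```

The second string is refused outright rather than patched up, which is the behaviour the comment is counting on.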
The question for me is how well they implement content-presentation separation. Will there be a 'Word 11 XSL file' with the actual content of the file separated nicely into tags like
or will the style and content be mashed together like so:
This is the question I want answered more than anything, and I can't wait to see which way they go with it. If everything is separated nicely, we may just have an excellent source for user-produced well-formed XML documents which can be integrated into XML-based content management systems with PDF-based presentation and HTML previews, etc.
Re:Open? (Score:2, Insightful)
Re:Are you paying attention? It's Microsoft. (Score:3, Insightful)
Really? Excellent! Please point me to the specification for the MS Office format, so I can write a cross-platform tool to open their files.
Re:Defaults (Score:4, Insightful)
Nonsense. Screw and nut sizes have been standardized without government involvement.
Re:Defaults (Score:3, Insightful)
You are goddamned fucking lucky that the government tells you what the default values for things should be. That's what the government is there for, mostly; to tell you that the default value for a building is to have a fire exit and that it may not be locked.
That's a safety standard. The government does not tell you what color the walls should be, however. It doesn't tell you whether you should use carpet or hardwood on the floors.
But if you really think they have no right doing these things, go live in a 3rd world country; they generally have the government telling you less about what to do. Except once in a while when they kill your family. You could be armed of course. You know what a totally armed society with a weak government looks like? Afghanistan.
Assuming you're talking about Afghanistan before the US bombed the hell out of it, you are wrong again. The government in Afghanistan told you exactly what you could or could not do. It told you what you could wear and how much. It told you how long to keep your beard. It told you whether you could study or not (if you were a woman). It told you what you could study. It told you who you could sleep with.
Re:It's XML, get over it. (Score:2, Insightful)
XML != open. XML only makes *syntax* clear (Score:3, Insightful)
Just because a file format is XML, it does not mean it's open. Even if it's "real" XML and not a wrapped binary dump (Vvjfio1@1/515...). All XML does for you is to make the *syntax* of the file format clear, not the underlying meaning. Analogously, in German, every noun begins with a capital letter, and root verb forms generally end with "-en"; this tells you a bit about the phrase "Mit grossem Bedauern haben wir vom Ableben Ihres Gatten erfahren", but it's certainly not enough to understand it.
Even an XML schema is not enough - that just tells you which elements can appear where and what they can contain. That's like knowing that a normal German sentence has the main verb in the second position in the sentence. This still doesn't tell you the meaning of the above sentence, though you can see that "haben" is the verb and "Mit grossem Bedauern" is the first part of the sentence.
For an XML language to be open, you need a full description of what each possible construct in that language means.
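As a rough illustration of the schema point, here is a toy Python structure check. It is a hand-rolled stand-in for a schema, not real XSD or DTD validation, and the ALLOWED table and tag names are invented for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: which child tags each element may contain.
ALLOWED = {"doc": {"a1", "a2"}, "a1": set(), "a2": set()}

def conforms(elem):
    """Check only *where* elements appear, the way a schema would."""
    if elem.tag not in ALLOWED:
        return False
    return all(
        child.tag in ALLOWED[elem.tag] and conforms(child)
        for child in elem
    )

ok_doc = ET.fromstring("<doc><a1>17</a1><a2>Bedauern</a2></doc>")
bad_doc = ET.fromstring("<doc><a3/></doc>")

ok_result = conforms(ok_doc)     # structure matches the content model
bad_result = conforms(bad_doc)   # a3 is not allowed inside doc
```

The first document passes, and you still have no idea what a1 holding 17 is supposed to mean. That gap is what a full description of the format would have to fill.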