AutoCAD Feature

AutoCAD 2004 DWG: Not Encrypted, Honest!

By Martyn Day, editor, CADserver, July 17, 2003

On April 1st, 2003, the OpenDWG Alliance publicly announced that its users were having trouble reverse-engineering the new AutoCAD file format. The reason for the problem, the group claimed in its press release, is that Autodesk had used complex compression, together with data encryption techniques: "the inclusion of data encryption and compression schemes within the new file format has created serious challenges to DWG data interoperability."

This is a serious accusation. What’s most significant is not the mention of file format compression (most file formats are compressed) but the allegation that the DWG file is actually "encrypted." All sorts of negative and nefarious connotations could arise from the suggestion that Autodesk has adopted encryption; namely, that the company is trying to lock in its users and keep out the competition, since opening a file would require every user to possess a valid copy of Autodesk software.

OpenDWG's claims

For those not up on the wonderful world of data interoperability, the OpenDWG Alliance was born out of the need of Autodesk's competitors to reverse-engineer AutoCAD's DWG file format. To become a member, you have to tell the OpenDWG Alliance all you know about the DWG format, pay a yearly fee and in return receive the DWG libraries that its programmers generate. The OpenDWG's goal is to make DWG an open standard "usable by all programs that need to access valuable DWG data." Needless to say, with proprietary file formats being seen as a competitive advantage, Autodesk isn't a member (DWG is the de facto standard, with the number of DWG files estimated at some 3 billion). In my talks with Autodeskers though, they wryly point out that they would join the Alliance if they were to receive libraries of all the other members' proprietary CAD products in return (those by Bentley, EDS, PTC, Graphisoft, Nemetschek, ESRI, and CADKEY to name a few).

After reading the Alliance's release, I spoke with Evan Yares, the group's president and executive director. Yares was convinced that Autodesk had chosen to encrypt the DWG file, despite assurances from Autodesk representatives that they had not. He sent me a DWG file with a corresponding Hex dump, with an explanation of how this demonstrated Autodesk's encryption. Unfortunately, as I am not a programmer, the evidence was somewhat hard for me to digest.

In an issue of the upFront.eZine newsletter, the OpenDWG Alliance made specific technical claims concerning the compression algorithm: "Rather than having a single compression type, each object type appears to have its own individual algorithm, with a large number of special cases. Object compression is controlled by a 32-bit flag, which provides for billions of possible permutations. OpenDWG has reverse-engineered the compression algorithms for some objects, but substantial work remains to be done." And on the encryption issue, the group stated: "Both the file and section headers are encrypted, but in different manners from each other. While OpenDWG has been able to determine the algorithm used for both, it has not been able to determine if the encryption keys used to scramble the data will remain static, will change in each point release of AutoCAD, or will ultimately be changed dynamically under program control." (See DWG 2004 - Tougher Than Thought.)

Surely, faced with such a specific and detailed accusation, Autodesk would respond? But the company appeared to hold back. Carl Bass, executive vice president of Autodesk's Design Solutions Division, told me that the OpenDWG Alliance claims were "total nonsense and hysteria." It seemed that Autodesk was going to wait it out; in time, Bass expected, the Alliance would realize there was no such encryption and make a public announcement or apology to that effect.

The plot thickens

Meanwhile, Montreal-based viewing and mark-up developer Cimmetry announced that it had support for the AutoCAD 2004 file format in its forthcoming update to AutoVue 17 - the same DWG format that the OpenDWG Alliance claimed was encrypted. Cimmetry had reverse-engineered the file format without any technology input from Autodesk. While this could not be taken as a sign that 2004 DWG was not encrypted (Cimmetry has reverse-engineered both non-encrypted and encrypted file formats in the past), it was a sign that the format was at least decipherable by an outside organization. In his CADwire.net commentary, Evan Yares, this time in his role as industry analyst, chose to doubt the validity of Cimmetry's claim: "Either Cimmetry told the truth in their press release, or they lied. I've seen no evidence that Cimmetry told the truth." I assumed that this comment reflected his role within the OpenDWG and the extra pressure that group must have felt on hearing that someone else had cracked 2004 DWG.

I managed to get a copy of the updated 2004 DWG AutoVue to test it for myself and then showed Yares that Cimmetry had indeed cracked the 2004 format. Yares then honorably posted a retraction: "Cimmetry can fairly claim bragging rights to having delivered the first third-party viewer with AutoCAD 2004 support. I definitely owe them this recognition, and my congratulations." It is true to say, though, that reverse-engineering a format well enough to read a file is only halfway there; the OpenDWG would have to be able to write 2004 DWGs as well.

All this time, Autodesk had remained silent. There were times past when I would have expected Autodesk to have fired off a few lawyers’ letters, or at least produced a qualified rebuttal. When I asked to interview Carl Bass on the subject of encryption, he agreed, and we were joined by John Sanders, executive VP of Design Solutions Division, and Mark Strassman, director of marketing for AutoCAD.

Autodesk responds

I first asked for Autodesk's reaction to OpenDWG's claim that 2004 DWG was encrypted. Bass replied, "The whole thing is actually pretty complex and I think people like Evan (Yares) have muddled the facts. There is an element of truth in many of the things they say but the gist of their argument is not correct. We changed the DWG file format for the customer's benefit. A lot of this benefit was around network performance and compression. We made no secret of the fact that we were compressing files and not very differently from something like a Zip file. We didn't use the same algorithm as Zip, but we did use a relatively standard and well-known compression method. As is often the case in computer problems, there is a trade-off between size and speed. Because of the interactive nature of a program like AutoCAD vs. zipping/unzipping files for email or archiving, we selected an algorithm that was optimized for performance."

Strassman added, "The smaller file size is one of the big features of 2004. One of the things Evan said is that we use different compression types for individual features and stuff like that, which is just false. We use a standard compression library throughout the DWG file. It's standard compression, there's no encryption. Compression just makes the file smaller."
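The size-versus-speed trade-off Bass describes can be sketched with any general-purpose compression library. The example below uses Python's zlib module (DEFLATE, the method most Zip files use) purely as a stand-in: Bass says Autodesk did not use the Zip algorithm, and the sample data is an invented placeholder rather than real drawing content.

```python
import zlib

# Repetitive placeholder data standing in for drawing content.
# This is NOT the DWG layout, and zlib is NOT the library Autodesk used.
data = b"LINE 0,0 1,1 color=red\n" * 5000

# Level 1 favours speed; level 9 favours a smaller result.
fast = zlib.compress(data, 1)
small = zlib.compress(data, 9)

# Compression is lossless and needs no key: anyone can reverse it.
assert zlib.decompress(fast) == data
assert zlib.decompress(small) == data

print(len(data), len(fast), len(small))
```

An interactive program that opens and saves files constantly would plausibly lean towards the faster setting, while an archiving tool can afford the slower one.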

Bass said that files were growing larger and larger and sharing them was beginning to overtax networks. "Now, if users only have to move 3 to 5 megabytes instead of 10, that's obviously better for everyone on the network. Just as we do when we send people big files, we often compress them, that's why people send JPEGs around instead of TIFFs and another reason why we came up with DWF. It's all about moving the information around more efficiently. We surveyed the customers and a vast majority run AutoCAD, or AutoCAD-based products, on a network, and this drastically improved the Open, Close and Save performance across a network. Compression is about better performance over networks - pure and simple. But if what we were doing was only that, you could decompress it at the other end and look at a file that was a 2002 file. We hadn't changed the file format in a while, because to do that means you have re-architect it. On the first release you do great. The second one gets a little bit messy, the third one gets pretty crusty and then you need to clean it up. And it's worth knowing where you are going for the next several releases, when you don't want to change file format. That's always been a barrier for users. So it makes sense to put in place an architecture that allows for that kind of extensibility."

John Sanders added, "Hopefully, we have a foundation for the next couple of releases so we won't need to change the format."

While that sounds a reasonable explanation, Autodesk changes its file format, on average, every 3 releases, which doesn't seem to be very forward looking. I asked Bass why this was the case when competitors such as Bentley managed to keep the same format for 15 years.

"If you go into Microsoft Word and you pull down the Save As menu, there are about 7 different file formats to choose from. You change formats because you have to add functionality. I really don't know enough to comment on Bentley."

Strassman explained, "A lot of the changes in the past have been considered minor because we were just stuffing more information into the DWG file format. This time, we wanted to make it so the file would really have a future, so we could eventually add new features to AutoCAD without changing the file format."

But wasn't that what the Object ARX programming language was developed for? "That's what ARX did from a code standpoint, but not from a data standpoint," Bass replied. "We've just made a much more flexible way to add data for us and for our third-party developers. They have a better mechanism to get at the data. So that's what that was all about, it's all geared toward customer benefit. That's why we changed the file."

Strassman expanded on this point: "We also did a bunch of other things to make it easier to recover files, to make it easier to see when a file is corrupted. We added all sorts of things for the functionality of the user, which will hopefully allow us to avoid changing the file format significantly in the future."

"Mark brought up an important point," added Bass. "People have always wanted a reliable 'recover' command. AutoCAD has always done a reasonably good job but it has never been 100% perfect in that area. But data recovery is a huge user request, especially because people get corrupt DWGs written by third-party DWG libraries - which is ironic in the context of this conversation. The problem is compounded when an architect, for example, creates a file, sends it to a consulting engineer and they apparently corrupt the file and then they want to be able to recover it and get the information back. So we have put in more mechanisms to make sure we could actually provide a recover in more circumstances."

Encryption vs. encoding

On broaching the encryption issue, I reiterated the OpenDWG's specific arguments concerning their findings, namely that the file headers are scrambled with a 128-byte magic number. This statement incited rebukes all around from my interviewees.

Bass and Sanders both chimed in, "Wrong, wrong, wrong!", while Strassman added, "There is no magic number."

"They will find out later," said Bass. "That's why I haven't been particularly interested in responding because the competitors just look incompetent."

What about the other OpenDWG allegation - that the sub-headers in the file are encrypted with a 4-bit key? "There is no encrypting," Bass replied. "None. There are no keys, they are wrong. I'm more than happy to make a statement about it not being encrypted, except for the password protection. We actually believe that customers should have total control over their data. As an example, we had a choice when it came to putting encryption for the password protection in the file, whether or not to have a back door to access the data and we decided not to. There is nothing special that we can do to that file that a user can't do. It's totally down to user control. We believe fundamentally that users have a right to control their data."

The ownership of data is an interesting point. Should users only have access to their data through an Autodesk product? Bass replied, "If they created it in a DWG file format that's not from an Autodesk product, I don't think we are that involved in this question, right? If they created the DWG in MicroStation, we would have no real involvement in that conversation, they just happen to have chosen our format but we have no obligation, one way or another. The relationship is entirely between that user and Bentley."

If a non-Autodesk customer was given a DWG file to work with, does Autodesk believe that person should buy a copy of its CAD software to open it? "No, it's up to them to look along the axis of price, features and fidelity," answered Bass. "I think that person is someone we don't have an obligation to. Are we interested in them becoming a customer? - that's a different question than whether we have an obligation to provide them with free software or with certain functionality." Bass added that he was actively considering allowing an independent third party to reverse-engineer the DWG file format to adjudicate on the issue. He said he wouldn't pay for it to be done but the doubters could. To offer DWG up for independent analysis would be a foolish thing to do if 2004 were indeed encrypted. I took this alone to be a sign that Bass was willing to put his head on the block to state that AutoCAD 2004 DWG was not encrypted.

So, on the question of encryption, an emphatic "No" from Autodesk - along with some bruising comments on the capabilities of the OpenDWG Alliance. Speaking of which, I've seen correspondence between Autodesk and the OpenDWG, in which the former admitted there was some "simple encoding" in the new DWG. I asked Mark Strassman what this meant.

"There is a big difference between encoding and encryption," he said. "Encoding, when used in a software development context, typically means translating some concept to a digital form for use by the computer. For example, ASCII is an encoding scheme for the English alphabet and punctuation. In ASCII, the letter 'A' is encoded as the value 65, or 1000001 binary. In fact, ASCII stands for ‘American Standard Code for Information Interchange.’ Thus, letters placed into this ‘code’ are encoded in ASCII form. Similarly, in AutoCAD we have to translate things like geometry and attributes into a digital code to be interpreted by the computer and stored on the hard drive. DXF is one form of encoding. DWG is another. So, the concept of a red line from 0,0 to 1,1 would be encoded as some series of binary numbers in the DWG. This is the manner in which we used ‘encoding’ in our original email with Evan's team.

"Encoding is a word in common usage in software engineering. Unlike encryption, encoding does not imply any attempt to hide or obfuscate information. Because laypersons occasionally confuse the usage of ‘encoding’ and ‘encryption,’ we have stopped using the term encoding when referring to the DWG. The only encryption in the AutoCAD 2004 DWG is file password protection, which is totally under the control of the user and is there to allow for secure transmission of drawings solely at the user's discretion."
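Strassman's ASCII example, and his red line from 0,0 to 1,1, can be sketched in a few lines of Python. The first part uses the real ASCII mapping. The binary layout for the line is a hypothetical one invented for this illustration (one byte for a colour number, then four double-precision coordinates) and is not the DWG or DXF representation.

```python
import struct

# ASCII is a published mapping: anyone can encode and decode with it.
encoded = "A".encode("ascii")
code_point = encoded[0]               # integer value of the letter 'A'
binary = format(code_point, "b")      # the same value written in binary
decoded = encoded.decode("ascii")

# Hypothetical layout, for illustration only (not the real DWG format):
# 1 byte colour number, then x1, y1, x2, y2 as little-endian doubles.
LINE_LAYOUT = "<Bdddd"
RED = 1  # assumed colour number for red

def encode_line(color, x1, y1, x2, y2):
    """Translate the concept of a coloured line into raw bytes."""
    return struct.pack(LINE_LAYOUT, color, x1, y1, x2, y2)

def decode_line(raw):
    """Reverse the translation; no key or secret is involved."""
    return struct.unpack(LINE_LAYOUT, raw)

raw_line = encode_line(RED, 0.0, 0.0, 1.0, 1.0)
print(code_point, binary, decoded)
print(raw_line.hex())
print(decode_line(raw_line))
```

Nothing here is hidden: given the layout, any program can read the bytes back. The dispute in this article is over whether the 2004 DWG adds a step beyond this kind of translation.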

OpenDWG support

On May 16th, the OpenDWG Alliance announced official support for the 2004 variant of the AutoCAD DWG file format. In the press release Evan Yares stated that, "Although we had no significant concerns about being able to implement support for the AutoCAD 2004 DWG file format, there were enough variables that the task was not trivial." This statement, to me, is a bit rich since those "insignificant concerns" had generated an attacking press release only the month previous. The release went on to claim that, "In AutoCAD 2004 DWG, a comprehensive compression algorithm is applied to almost all data structures, and the file and section headers are encrypted using a magic-number/XOR algorithm." A cautionary note to users added that, "Despite the fact that we support the format, users should continue to be cautious about using AutoCAD 2004 DWG files for projects which require long-term data access as the format does contain encryption."

Again, a series of specific claims, although the Alliance seems to have dropped an earlier allegation that there were billions of magic number permutations and that the encryption could be changed on the fly without introducing new builds of AutoCAD.
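For readers unfamiliar with the term, a "magic-number/XOR" scheme of the general kind the Alliance describes can be sketched as follows. This is a minimal illustration only: the mask bytes and the sample header are invented, and nothing here reflects the actual AutoCAD 2004 DWG layout or any value found in a real file.

```python
from itertools import cycle

# Invented mask for illustration; not taken from any DWG file.
MAGIC = bytes([0x5A, 0xC3, 0x0F, 0x99])

def xor_mask(data: bytes, magic: bytes = MAGIC) -> bytes:
    """XOR every byte of data with the repeating magic sequence."""
    return bytes(b ^ m for b, m in zip(data, cycle(magic)))

header = b"example header bytes"   # placeholder content
scrambled = xor_mask(header)       # unreadable in a hex dump
restored = xor_mask(scrambled)     # applying the same mask again undoes it

print(scrambled.hex())
print(restored)
```

Because XOR is its own inverse, anyone who recovers the fixed mask can read and write such data. Whether a step like this counts as "encryption" or as part of the format's "encoding" is the semantic question on which the two sides disagree.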

Conclusion

The AutoCAD 2004 DWG encryption debate is a very complex one to follow, as it relates to both an understanding of the technology itself and the semantics of the arguments. I have to believe Carl Bass when he says that Autodesk has not encrypted the file - not in the classic definition of running an algorithm over the DWG to hide its contents from all outsiders. If that had been the case then it was a failure, because Cimmetry announced support within a month of AutoCAD 2004's shipping and it only took the OpenDWG Alliance a month beyond that. Besides, Autodesk is savvy enough to foresee the kind of bad press that would result from doing something as outrageous as encrypting the basic DWG file and, in effect, trapping their customers.

That said, Autodesk has done a major amount of work to the AutoCAD file format and data compression is, in some way, data obfuscation, where data is reduced in size via a formula (algorithm) and reconstituted on loading. As compression is standard across the industry, one could hardly point a wagging finger at Autodesk. I have it on good authority, however, that the compression used in AutoCAD 2004 is very complex and doesn't appear to give DWG any greater ratio of compression than PKZIP provides. So why adopt it? I haven't had a sufficient explanation from Autodesk.

Although Autodesk obviously hasn't overtly encrypted the 2004 DWG, certainly it isn't in the company’s interest to make the reverse-engineering process any easier for its competitors. Autodesk has made code changes to AutoCAD LT in the past to "dissuade" application developers from coming up with applications for AutoCAD LT. It may well be that this was taken into consideration when the new DWG was being devised; with all the changes, improvements, and semantics it's difficult to see the big picture.

If anything, Autodesk should be more worried about why people were so willing to believe that it had overtly encrypted the file format. Indeed, on my travels it appears that the general perception is that Autodesk has used encryption; "It sounds like something Autodesk would do" being a typical response. In light of such widespread negativity, perhaps Autodesk should respond with more openness to its customers and to the industry at large. Autodesk's negativity towards Adobe's PDF format makes one think that format definition and control of those formats really does matter to Autodesk.

One of the biggest issues in the CAD world is interoperability, the battle between proprietary systems and open formats. Nearly all CAD systems are in some way proprietary because they are devised and controlled by the company that originated them. As a customer of these software firms you own the information that is stored in their "wrapper" (file format) but do not always have an independent way of gaining access to that information. The OpenDWG Alliance claims it is acting for the good of open systems. But it's worth noting that the group’s existence is funded by competitors to AutoCAD and its DWG format. I think this is an awkward position to defend. As for Autodesk, it tends to rest its "open format" laurels on DXF, developed in the 1980s to solve the company’s own problems of transferring DWGs between incompatible operating systems.

There are no real saints here, on either side of the divide.
 

About the Author

Martyn Day is group editor of MCAD Magazine and AEC Magazine. For more information, visit the CADserver website.
