Here are some excerpts from newsgroup postings by Geoff Chappell, which
draw a line between “reverse engineering” and “translation”.
Subject: Re: OSR: What makes them so damn special? (Was: “OSR’s Seminars,
Any Experience With?”)
Author: Geoff Chappell
Russell Johnson wrote in article
…
> My company had to pay huge sums of money for my “right” to
> chat with Microsoft engineers.
If Microsoft’s published literature did contain all the necessary technical
details, then it would be quite reasonable that Microsoft charge for
assistance in interpreting those details.
As a manufacturer of an operating system, Microsoft’s obligation to
programmers and consumers is discharged merely by publishing a cold
technical specification. Making the specification helpful in senses such as
holding programmers’ hands through typical exercises or showing them how to
put the technical details together to make real-world code is a commercial
decision but not an obligation.
As a programmer, it is your role to select from a technical specification
the pieces of information that you then put together to make some program
that does useful (and maybe even interesting) work. If you’re a busy
programmer and want someone to get you the answer rather than do all of your
job yourself, then you have to expect to pay whoever shares your work.
Unfortunately, this all breaks down because even the best technical
specification may have errors or omissions. Approaching the manufacturer for
help to do your work is therefore different from approaching an independent
consultant. Your approach to the manufacturer may be based on an assessment
that the manufacturer’s documentation is incomplete or plain wrong.
Microsoft will still want to charge you.
Now, this is fair enough if it turns out that Microsoft can solve your
problem by pointing to its documentation, showing which pieces of
information can be put together to do your work. As far as I am concerned,
it is still fair enough even if the documentation is so obtuse that only a
genius could find the pieces and realise that they go together to solve the
puzzle. However, Microsoft will still want to charge even if its assistance
requires information that is absent from the documentation. This just isn’t
right for Microsoft to do.
Greg Payne wrote in article
<01bd1b53$ad335560$xxxxx@greg-payne>…
> In the past, I’ve decided to “make my own documentation” by reverse
> engineering the kernel. Although morally opposed to reverse engineering for
> most reasons, I believe that I was provided with no alternative.
Did you really reverse engineer? The terms “reverse engineer”, “decompile”
and “disassemble” all suggest an attempt to recover source code. Many regard
this with suspicion. I am among them. Authors have a good case for
protection from having even small portions of their work copied or adapted
into the work of others.
I also note however, that recovery of source code, or of anything that can
be recompiled or reassembled, is entirely unnecessary if the aim is to study
the executable, derive an understanding and then apply that understanding to
the development of new code that interacts with the studied code.
The terms “disassemble” and “disassembly” are also used in connection with
listing the contents of an executable in a vaguely human-readable form, most
notably to use mnemonics for instructions. In legal terms, the worst that
can be said about these disassemblies is that they may be
translations. Let’s concede this for the sake of argument. Then as I
understand it, they present no problem in private use. If preparation of a
translation and subsequent study of that translation (instead of studying
the original text) helps you gain a better understanding of the original
text, then you are free to sing that understanding to the world, even at a
price - though you must of course use your own words when expressing your
understanding.
> The situation sparks a few points of curiosity: Where does OSR gain its
> expertise? From Microsoft?
In several correspondences, people have noted to me that Microsoft does
provide source code for such things as the NT kernel, at a price and under
conditions. Since some of these people have also put figures to the price
and these figures have roughly agreed, I incline to believe that this does
happen. In itself this is no surprise. Microsoft has a long history of
working closely (though generally not very openly) with other companies to
sort out adaptations of system code, especially for hardware-specific
matters.
However, some have also suggested that OSR is among the companies that have
this access to source code. I don’t say that they do have this access, but
if they do, it would be controversial if this access helps with their
business of consulting and training. Microsoft’s system software and its
publications about that software ought to provide sufficient information for
others to develop reliable software that interacts with Microsoft’s system.
If nothing else, this is a premise on which Microsoft lobbies governments to
outlaw methods of inspecting executable code, even if the intention is to
understand Microsoft’s code well enough to write new code that interacts
with Microsoft’s.
Now, even if programmers outside Microsoft have comprehensive and accurate
publications from Microsoft, they may need assistance to interpret the
information or to make practical use of it. They may benefit from someone
else’s experience or insight. They may go to consultants and trainers.
If Microsoft were in the business of consulting and training, I should think
most commentators would regard it as wrong for Microsoft to offer its
consulting and training customers technical details that are not included in
the official literature. It would be unfair to Microsoft’s competitors in
the consulting and training industry. It would be unfair to Microsoft’s
customers who had purchased the literature in the reasonable belief that it
suffices for the job. Does this unfairness go away if Microsoft provides
extra technical details, such as source code, to a selection of consulting
and training companies? Maybe it does. Maybe it doesn’t. But it would
certainly be a matter for public interest.
Greg Payne wrote in article
<01bd1d9c$696d0ce0$xxxxx@greg.netspace.net.au>…
> Is this misleading advertising? It’s an official Microsoft answer to a
> question frequently asked by potential subscribers.
I think there would be little doubt that despite the usual disclaimers about
supplying “as is”, Microsoft has warranted that its documentation is much,
much better than it truly is.
It is important though to observe that in some areas of programming, the
documentation probably does meet reasonable criteria for accuracy and
coverage. I note that in a little exercise concerned with the Windows 95 GUI
Shell, there may have been tons of undocumented interfacing between
SHELL32.DLL and EXPLORER.EXE but in all my investigations and coding I don’t
recall once finding any OLE detail where the code was in conflict with the
documentation.
In systems programming however (and I of course offend at one stroke
numerous applications programmers who have regarded OLE as complex low-level
stuff), the documentation is easily seen as ridiculous by anyone who cares
to make the study. The several hundred pages of VMM
documentation in the Windows 95 DDK must average out to at least one issue
(error, omission, cross-reference problem, etc) on every other page. The
chapter on Synchronisation has several pages with two and even three errors.
On top of this, there are systematic errors whose effect is distributed
across the whole of the documentation (and can’t be pinned down to any one
page). It really is quite offensive that Microsoft passes this stuff off as
adequate, let alone as being everything one should need.
> In an absurd legal system, Microsoft could probably claim that the binary
> image of the OS constitutes highly obtuse (but nevertheless complete)
> documentation, and that Microsoft engineers’ services had been made
> available for programmers wanting to delegate this “documentation
> interpretation” part of their workload.
In an absurd legal system, maybe. However, Microsoft must by now be on the
record (through submissions to government) in just about every western
jurisdiction as saying, more or less, that programmers have no business
whatever to treat executable code as obscure text that they should be
permitted to read.
Greg Payne wrote in article
<01bd1da9$94de0a20$xxxxx@greg.netspace.net.au>…
> Maybe “reverse analyzing” or “reverse designing” is a better term. I merely
> wanted to retrieve programmatic interfaces in situations where documentation
> was letting me down or steering me in wrong directions.
It’s pretty clear to me what you wanted - but I am proficient at certain
types of programming and more aware than many of just how littered is
Microsoft’s system documentation with errors and omissions. Sadly, the
general public, lawyers and the politicians who end up having to take views
on the issue tend not to understand the programming issues - while on the
other side, companies such as Microsoft have well-oiled lobby machines that
are managing with some success (though not in Europe, thankfully) to paint
all non-experimental software analysis as one step short of ripping off
their copyright. We need some distinct term for the sort of analysis that
involves a study of a program’s instructions but which does not actually
reverse the engineering.
> > I also note however, that recovery of source code, or of anything that can
> > be recompiled or reassembled, is entirely unnecessary if the aim is to
> > study the executable,
>
> Of course. Stepping through with Soft-ICE provides a better understanding of
> a more precise area than manually recommenting a massive disassembly output
> in the hope of stumbling across the minute portion of code you’re interested
> in before your deadline expires. …
Each to his own methods, I guess. The big problem with stepping through is
that it concentrates on the execution paths taken in the tested
configuration and risks missing others. I couldn’t regard as reliable any
information deduced this way (but that doesn’t mean a lot since I am so
stubborn and pig-headed that I couldn’t regard as reliable any information
deduced by anyone other than me).
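The point about execution paths can be illustrated with a small sketch. It is not taken from the original postings, and it uses Python bytecode as a stand-in for the Intel machine code under discussion; the names `classify`, `lines_executed` and `lines_with_code` are hypothetical. The sketch runs a function under a trace hook for one input, which is roughly what stepping through in a debugger does, and compares the lines seen with the full set of lines that a static listing of the compiled code reveals.

```python
import dis
import sys


# Hypothetical code under study: three distinct execution paths.
def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


def lines_executed(func, *args):
    """Run func(*args) under a trace hook and return the line offsets
    (relative to the def line) that were actually executed."""
    code = func.__code__
    first = code.co_firstlineno
    seen = set()

    def tracer(frame, event, arg):
        # Only follow the frame that belongs to the function under study.
        if frame.f_code is code:
            if event == "line":
                seen.add(frame.f_lineno - first)
            return tracer
        return None

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return seen


def lines_with_code(func):
    """Return the line offsets that have compiled instructions, taken from
    a static listing of the code object without executing it."""
    code = func.__code__
    first = code.co_firstlineno
    return {
        line - first
        for _, line in dis.findlinestarts(code)
        if line is not None
    }


if __name__ == "__main__":
    traced = lines_executed(classify, 5)
    static = lines_with_code(classify)
    print("seen while stepping with n=5:", sorted(traced))
    print("present in the static listing:", sorted(static))
    print("missed by that one run:", sorted(static - traced))
```

With the single input 5, the trace never reaches the two early `return` statements, although the static listing shows that both exist. The same limitation applies, on a much larger scale, to conclusions drawn from one debugging session on one configuration.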
An interesting thing about stepping through is that as far as I have ever
seen, not even Microsoft asserts that one violates copyright by stepping
through system code in an effort to debug a problem in one’s own code. When
Microsoft lobbies governments to proscribe reverse engineering, I expect
that Microsoft stays well clear of asking when does stepping through code in
a debugger become reverse engineering.
In areas of programming - especially system programming as done by
participants of the VxD and NT kernel-mode newsgroups - where the
documentation becomes vague or ambiguous, is the programmer really required
by copyright (of all things!) to take a guess, write some code, sort out the
problems in the debugger, over and over? If the programmer has to face the
problems in the debugger, is it really the intention of copyright
legislation that he must not use that debugger as a tool for a wider study
of the code in advance of all the guesswork? I suspect not.
> > However, some have also suggested that OSR is among the companies that
> > have this access to source code. I don’t say that they do have this
> > access, but if they do, it would be controversial if this access helps
> > with their business of consulting and training.
>
> It would be a bit controversial. Now seems like a good time to remind you
> of Jamie Hanrahan’s words in an earlier branch of this thread:
Well, my news server is a bit sluggish sometimes, so I didn’t have the
benefit of this:
> “I don’t know where OSR’s information comes from, but all of our effort
> is done with close cooperation from the NT kernel team. They know
> that we are helping people move to NT and so they have provided a
> great deal of support. We have spent hundreds of hours in Redmond
> browsing NT source code and walking across the hall to talk to the key
> developers.”
I do find it a bit unsettling. Jamie may be happy to have a “great deal of
support” from Microsoft, all the way to “browsing NT source code”, but where
does this practice leave NT system-level programmers in general?
On a small scale of course, we probably have here some individuals at
Microsoft (or perhaps a small department) just doing what they can to help.
Their effort should probably be applauded.
On a corporate scale however, what we have is an unspoken recognition of an
inadequacy with the published documentation. Surely the documentation should
be all that the practising programmer requires in order to develop an
understanding of the system sufficient for confident programming. If there
is any benefit to be gained from allowing inspection of source code, then
why is that benefit not instead imparted through better
documentation?
Remember that according to Microsoft, in several sources, the
documentation of system interfaces is sufficiently good that programmers
should never need to “reverse engineer” for the purpose of learning
information required for interoperability. If this were true, we would not
see Jamie (or anyone in his position) talking of any real benefit from
having been shown the source code.
This brings me to a personal reason for being unsettled. In my areas of
specialty in VxD Programming, it’s unthinkable that browsing Microsoft’s
source code would bring to me any technical details that I care about.
(Sure, for curiosity, I’d like to know a few things about why such and such
was done - for instance, just which companies wrote those VxDs that need the
VMM to patch them at run time?) Besides, I place rather more value on
knowing what the code really does, not what Microsoft’s programmers thought
they were writing. Now Jamie is clearly a bright guy with deep experience in
NT, but I’m hardly inspired by the thought that research in NT may be so
thin that people are still finding it beneficial to see the source code. I
hope it’s not that NT is too big or complex to yield to external study.
Peter Desnoyers wrote in article
…
> If you’re that paranoid about other people being able to examine your
> work, mass market software is probably the wrong field for you. If
> you sell me a copyrighted work, the law keeps me from copying any
> portions larger than what is allowed under the fair use doctrine.
> However, it certainly doesn’t prohibit me from reading it.
I’d have thought that my subsequent paragraphs made as explicit as possible
my belief that nothing prevents one from reading another’s text. I guess I
did get boring after my first paragraph, but for some reason I feel I have
been misunderstood here.
What I believed I was expressing is that attempts to recover source code may
go beyond reading, and even beyond a reading that proceeds through private
translation. If restrictions on “reverse engineering”, “decompilation” and
“disassembly” are lawful, they may apply to attempts to recover source
code. They certainly don’t apply to attempts to read executable code, even
through a process of translation that doesn’t actually reverse the
engineering process.
Certainly, if in my study of another’s executable code I never produce any
intermediate translation that even looks like it could be re-engineered to
produce an adaptation, then I cannot be said to have come even close to
giving a copyright holder reasonable cause for concern. I, personally, have
reservations that the same certainty applies if my study involves the
attempted generation of source code.
I would not like to see legislation that proscribes the generation of source
code as a method of preparing a translation that helps one read text in a
foreign language. However, I am human enough to recognise that because this
particular method is a necessary tool for the copying or adaptation of a
copyright holder’s work into someone else’s product outside of fair use, it
alarms those who hold copyright on software.
> … the law keeps me from copying any
> portions larger than what is allowed under the fair use doctrine.
May I stress here that the fair use doctrine does not allow the copying or
adaptation of even small portions into another text without permission from
the copyright holder unless certain tests are met. There are approved
purposes such as criticism and review, reporting of news, and some admittedly
abstract notions such as scholarship. It is also important to note that fair
use will not generally succeed as a defence unless attribution is given - no
matter how small the copied portion.
So, for instance, when Walter Oney recycles in his book some paragraphs from
a Compuserve message that I wrote to him in 1994, the fair use doctrine does
not save him even though the portion is small and he introduces the matter
as criticism or review. He presents as his analysis something that he has
clearly adapted from my explanation to him - and there is no attribution in
sight. The law does protect me as the original author. However, distance
and cost (and my fear of sounding too petty) protect him from the
consequences of his plagiarism.
In software, viewed as any other text, it would generally be wrong to
reverse engineer someone else’s code in the sense of generating an
approximation to the original source code and then to extract and adapt for
inclusion in one’s own code. It doesn’t matter how small the portion. Fair
use won’t apply unless there is attribution and the new program is
believably using the original code for the approved purposes.
> In a case like Windows NT, where access to undocumented interfaces is
> necessary to compete in various markets, a strong case can be made
> that reverse engineering to discover these interfaces is actually
> protected, even in the presence of license agreements prohibiting it.
> In some jurisdictions (Europe, I believe) this protection is explicit.
Well, yes, I didn’t say otherwise. I did ask however what exactly is reverse
engineering? I have done possibly more than anyone else on the planet to
develop and apply skills at reading Intel opcodes for the purpose of
understanding Microsoft’s operating systems. Yet not since my first
experiments in 1989 have I even come close to reversing any engineering
process.
If Greg Payne says necessity compelled him to reverse engineer despite his
reservations about the ethics, I don’t doubt the necessity but I do raise
the question of whether he actually reversed any engineering. The chances
are quite high that he didn’t and by pointing this out, I thought I might
spare him from his trouble with ethics.
Peter Desnoyers wrote in article
…
> In particular, there seems to be an assumption in your posts that
> since a decompilation into higher-level source form could be used to
> violate copyright, it is therefore a priori a copyright violation, or
> the moral equivalent of one.
There was no such assumption. I said that I am among many who view with
suspicion an attempted recovery of source code. I didn’t say that the
attempt should be unlawful.
By differentiating techniques of analysis, I did try to ease the mind of a
correspondent who expressed an ethical reservation about his own reverse
engineering. Some methods of analysis that are often described as reverse
engineering - and which I believe are the ones most often used by
programmers who try to uncover accurate information - have the merit of
presenting no reasonable ground for suspicion.
Those alternative methods also have the merit of being superior: I have
never met anyone who insists on recovering a representation in source code
and whose deductions are worth reading.
> If one technical means (recovery of
> assembler source - i.e. the compiler intermediate) is OK, then why not
> another (recovery of an equivalent to the compiler source)? Who
> decides?
I certainly made no distinction. I regard an attempt to recover assembler
source with as much suspicion as an attempt to recover compiler source.
Jamie Hanrahan wrote in article
…
> From everything I have heard, there is no evil intent here
No doubt this should all be left alone, but I am curious about something:
where does the suggestion of evil intent come from?
> – it’s a
> question of available skills.
It’s a question of whether Microsoft delivers an operating system that is
documented well enough for programmers on the other side of the interfaces
to write interacting code reliably and confidently. Does the documentation
meet the claims that Microsoft makes of it, whether in advertising the
documentation to programmers, in telling other programmers’ customers that
comprehensive documentation is provided, or in lobbying governments that the
documentation is good enough for programmers to never need to conduct
reverse engineering?
If the answer is no, then to say there’s no evil intent is neither here nor
there, and although the availability of resources may help in an assessment
of reasonable expectations of quality, it cannot excuse a mismatch between a
product and the claims that its manufacturer makes of it.
--
Geoff Chappell
Software Analyst