Saturday, November 14, 2015

Reverse engineering and the security of software

Erratum: I apologize if anybody is left with the impression that the "pond model" is my invention. It is not. I cannot trace its history, but here is a reference for it:


Large portions of this blog post were written in April (or thereabouts). I’ve only updated it in some places and seriously reworked a comment on bug bounties. It was part of a larger blog post that I wanted to write and never finished. It just kept on growing. That blog post would’ve been called “Perverse infosec incentives” and was about the totally messed up incentive structure present in the infosec sector today. Any microeconomist would have a field day applying standard microeconomic theory, especially the industrial organization branch of microeconomics, to analyze just how badly infosec economics works. The other sections, which I’ve not finished and probably never will, have the following headers. I’m sure most of you can fill in the blanks from reading just the headers.
Why exploits are weapons
Why #yourboyserge proves that anti-virus works and why I am free riding
Why anti-virus companies might be better off showing false positives
Why does publicly available information about offensive security dominate defensive
Why would it be a good idea for an infosec company to blog
Peak malware

You may ask why I’m posting this now and haven’t done so earlier. Well – one reason is that I was beaten to bringing up the subject. I believe Haroon Meer was responsible for that, but I’m not sure. Anyway, I’m glad that it was brought up, whoever did it. The other reason is that I had actually halfway decided against trespassing on infosec politics terrain, because I wanted to keep my blog about technical stuff. Eventually there was a bit of mission creep with the “Nostalgia posts” and I think I’ll just let it creep some more. I blog for fun, so why should I care about mission creep? Finally came a tweet from Marion Marschalek that read: “Infosecs first problem in policy work is its heterogeneous community that doesnt get its plenty opinions together. Free after @halvarflake“. All this made me think that even though these things have been said before, and almost certainly better than I can say them, maybe if enough people join the choir somebody will listen.

Reverse engineering and the security of software

License agreements often disallow reverse engineering of software. Sometimes software manufacturers even argue that reverse engineering should be generally outlawed to protect their intellectual property. I shall argue here, using classical arguments from the Industrial Organization branch of economics, that reverse engineering for information security is in fact important for all information security and for software quality in general. This blog post is not meant to be rigorous, but to introduce some useful economic concepts. Its release was motivated by a discussion on reverse engineering and the relevance of the Akerlof(1970) article to that discussion. Though I agree that the article is indeed relevant, I shall argue here that while Akerlof’s problem has simple solutions, those solutions do not work well for the software security problem, and that reverse engineering does provide a solution.

You would obviously be right if you immediately thought that economics doesn’t say anything about security. But it does say something about quality, and most customers would consider a product being secure a quality; society as a whole certainly should. Thus when I talk about software quality in this text, I’m talking about how secure the software is. There are other aspects of software quality which I’ll ignore here, even though the same arguments extend to other measures of quality. I’ll use the quality lingo to stay in tune with classic economics.

Akerlof(1970) made an economic model of the market for used cars. His model has two kinds of used cars: cherries and lemons. Consumers are assumed to be unable to tell whether a car is in great shape (a cherry) or in bad shape (a lemon – named after your facial expression once you find out you bought one). However, the car dealer or previous owner would know. Say the value of a lemon is $600 and the value of a cherry is $2000. With every other car being a lemon, the average value would be $1300, and a rational, risk-neutral consumer would thus be willing to pay this for a car. Obviously no dealer would sell a cherry at this price, and the market would break down.
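The arithmetic above can be spelled out in a few lines of Python. This is only a sketch of the two-car example, not Akerlof's full model. The function names are mine, and I assume that a seller withdraws a car as soon as the market price falls below the car's value.

```python
def expected_value(offered):
    """Price a risk-neutral buyer pays: the average value of the cars on offer.

    `offered` maps a car's value to its share of the cars for sale.
    """
    total_share = sum(offered.values())
    return sum(value * share for value, share in offered.items()) / total_share


def unravel(offered):
    """Repeat until stable: sellers pull every car worth more than the price.

    Assumption (mine): a seller's reservation price equals the car's value.
    """
    offered = dict(offered)
    while True:
        price = expected_value(offered)
        staying = {v: s for v, s in offered.items() if v <= price}
        if staying == offered:
            return price, sorted(offered)
        offered = staying


if __name__ == "__main__":
    market = {600: 0.5, 2000: 0.5}   # every other car is a lemon
    print(expected_value(market))    # 1300.0
    print(unravel(market))           # (600.0, [600])
```

At $1300 the cherries leave, so buyers who know this will only pay the lemon price of $600, and only lemons are traded.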

Akerlof’s example illustrates what economics calls imperfect information. The parallel with software security is that normal people are perfectly incapable of detecting whether software is easily exploitable or not, in the same way that people like me couldn’t tell a cherry from a lemon. But this is where the parallels start to break down. The solution is easy in the used car market, because a dealer warranty goes a long way to realign incentives – nobody wants to give a warranty on a lemon. Vendors could also start selling only cherries to build a reputation. No such simple solution exists for software.

Economists sometimes distinguish between three kinds of products in relation to quality and imperfect information. The first is the “search good”: I don’t know which laptop I’d like to buy, but if I research it long enough I’ll figure it out. The second is the “experience good”: you find out the quality of the product after you buy it. Drive a car for a couple of years and you know if it is a lemon. The last is the “credence good”: chances are that you’ll never know if your toothpaste contains the right amount of fluoride for your teeth.

I argue that software is a “credence good”: the software consumer is unlikely ever to know whether the software was of good quality and thus secure. The thing with software exploits is that they often leave little or no trace, so even if you do get hacked you’d often not know how it happened – even though you may very well figure out that you were hacked. This gives software vendors a perverse incentive not to invest in software security: consumers cannot identify it in any way and it’s likely to be costly, which makes vendors that invest in security less competitive. We would predict only a partial market breakdown for software – everybody knows that there is no cherry software, because there is no reason to produce it. Because of the “credence good” nature of software, that is, the fact that consumers never learn the quality, there is no easy way out of the problem through warranties or liability. Nor can vendors build a reputation by only selling cherries. In this oversimplified model we’ve institutionalized software insecurity.

We need to address the problem of software insecurity at its root: the imperfect information. Restaurants are reviewed; government agencies and scientists spend lots of time figuring out how much fluoride should be in toothpaste, and sometimes even how much fluoride is in a given toothpaste. Cars undergo technical tests, and so on. Why should this be different for software? Even if only a few customers are informed, it’ll increase the incentive to make quality products, to the benefit of all. By being more demanding, the informed customers drive up the quality of the products. Or in economic terms: “The informed customer exert a positive externality on the uninformed ones” – Tirole(1994).

The litmus test for the security of a software product is how easy it is to find insecurities in it, and the way this is done is by reverse engineering parts of the software, looking for security-relevant errors. Even when the results of a reverse engineer’s work are not understood by the broad public, the presence of informed customers improves quality for everybody. Sharing the information only helps the process.

We should not assume that hackers with ulterior motives will abide by license terms or even laws. And while openly sharing information on the inadequate quality of software may give rise to opportunities for hackers with less noble motives, in the long run institutionalized insecurity will prove worse than occasional glitches.

Now the model I proposed here is obviously oversimplified. Sometimes hacks can be attributed to bugs in specific software. Mikko Hypponen(2011) argues that a major security focus by Microsoft was a direct result of Code Red and its clones causing havoc on Windows systems. To me that is evidence of the mechanism actually working: Microsoft gathered that consumers would identify their lack of quality. Also, people tend not to be of the species Homo economicus, but rather Homo sapiens with much more diverse behavior. I do however think that my model captures an important part of reality. Fortunately some tech companies see it similarly, and companies like Microsoft, Google and F-Secure either pay for reported security bugs or keep a hall of fame of people who have reported security bugs found by reverse engineering.

Reverse engineering for security purposes should be embraced by anybody who favors information security.

The not so depressing effects of bug bounties

Recently Jacob Torrey wrote a blog post on the “Depressing Effect of Bug Bounties”, Torrey(2015). Mr. Torrey argues that the work reverse engineers do to find bugs in response to bug bounties will give companies an incentive to lower their own QA efforts. He argues: “By artificially deflating the cost of finding and fixing bugs in operation/shipped product through monopolistic means, bug bounties remove the economic incentive to develop better software by integrating security-aware architects into the SDLC. Bug bounties use their monopoly on setting prices“. I’ll grant Mr. Torrey that such effects are bound to happen, but I still come out in favor of bug bounties.

My first quarrel with Mr. Torrey’s argument is that bug bounties are not that monopolistic. Anybody wanting to claim prize money from bug bounties is likely to pick the bug bounty (and thus the product to reverse) where she is most likely to earn the most money for the least effort. Indeed, other alternatives exist, such as selling bugs to companies that trade bugs for weaponization. There is little to suggest that the supply of “bug bounty hunters” is inelastic in a way that allows software companies to dictate market conditions. What I mean is that there is little reason to think that “bug bounty hunters” cannot pick and choose as they wish between different bounties.

My second contention is that bug bounties aren’t new. Microsoft has had a “hall of fame” for security researchers reporting bugs since 2007, Microsoft(xxxx). Now this isn’t a monetary reward, but people in the infosec community do collect this kind of recognition to find a well-paying employer. Thus it’s the economic equivalent of a bug bounty: it’s value for service.

However, bug bounty prices at the moment are ridiculous. F-Secure offers a mere EUR 100 – EUR 15,000 for a bug report, and they alone decide what the actual payday is going to look like, F-Secure(2015). EUR 100 is about 2 hours’ worth of bricklaying in Germany; you can’t get a decent engineer to write a report for that amount of money, let alone research anything. It’s certainly a big difference from the prices achieved on the weaponization market, but at least the white hat side of things offers up competition, and that’s a good thing. I believe I read somewhere that $45,000 was paid by Hacking Team for a Flash exploit. I think these prices reflect that bug bounties are not yet taken seriously by some companies, rather than monopolistic competition.

Finally bug bounties carry the possibility for companies to use them as a signal of quality. Offering a million dollars for a bug says a lot about a company’s confidence in their software’s security. It goes a long way to solve the imperfect information problem. It is in many ways equivalent to the warranty solution in the market for cars. That reverse engineering is allowed is always a prerequisite for bug bounties to work though.

The fact is that bug bounties incentivize bug hunting in a white hat manner, and the depressing effect on in-house QA that Mr. Torrey argues for is likely to be small – especially if bug bounties become widespread and a fair evaluation system can be applied. Further, most bug bounties only suppress information on bugs for a period of time, thus allowing information on product quality to arrive at the market. Is EUR 15,000 a strong signal of confidence in F-Secure’s security? I doubt it.

Despite my “disagreement” with Mr. Torrey’s article, I find it refreshing to see such blog posts about infosec. It’s good work and it deserves a read.

The pond argument

It is sometimes argued that hunting for exploitable bugs is like catching frogs in a pond. If there are many frogs in the pond, removing one will have a negligible effect on the frog population and thus on the effort required to catch another. If there are only 3 frogs in the pond, you’ll eradicate a third of the population, making it much harder for other (more dubious) people to catch a frog. It’s a wonderful argument and I find it has lots of bearing on exploits. I think there are lots of exploits in the pond, and thus finding a 0-day doesn’t contribute significantly to security in general.
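To put numbers on the frog example, here is a tiny sketch. The function name and the pond size of 1000 are made up for illustration; only the 3-frog pond comes from the argument above.

```python
def fraction_removed(frogs_in_pond, frogs_caught=1):
    """Share of the frog (bug) population that disappears when some are caught."""
    if frogs_in_pond <= 0:
        raise ValueError("the pond must contain at least one frog")
    return frogs_caught / frogs_in_pond


if __name__ == "__main__":
    print(fraction_removed(3))     # a third of a small pond
    print(fraction_removed(1000))  # a rounding error in a big one
```

Catching one frog removes a third of a 3-frog pond but only a thousandth of a 1000-frog pond, which is why the size of the pond decides how much a single found 0-day matters.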

However, the pond argument ignores the imperfect information argument for reverse engineering above. Having bugs found too often implies insecure software. Flash and font bugs have gained that reputation. In fact, the many security bugs in Flash have become a real problem in the marketplace for Adobe. The number of bugs that have been reported in Flash is often used as an argument to turn it off and replace it with something else. That is an argument that I fully endorse.

However, my guesstimate remains that software authors produce more security-relevant bugs than security researchers can find. To err is human, and programming is for now a very human task. I shall not deny that I’m responsible for my share of bugs. The attack surface (the pond) is expanding faster than security researchers can keep up. The infosec community would be wise to look for methods of avoiding bugs in the first place (google “langsec” to see what I think is the most promising path), or at least for solutions that put significant hurdles in the way of using bugs for security breaches.

The real problem

While bug bounties are, in my view, certainly a step forward, the core problem behind the insecure state of software development is liability. Companies that get hacked do not carry the cost of it. Sony spent $15 million on “investigation and remediation costs“ according to their Q3 2014 financial statement. In essence Sony gave everybody whose data was dumped a credit monitoring subscription, and that’s it. The trouble for those whose private data actually gets abused is that they are never reimbursed in any meaningful way. More damaging information leaks, such as those that happened with the Ashley Madison hack, significantly harm those involved, but there is no compensation, and thus only a fraction of the societal cost is billed to the hacked company. This is called moral hazard in economic terms, because the decision on how much to spend on security is divorced from who carries the cost in the event of a hack. The problem of moral hazard in infosec is twofold. First, companies have little incentive to invest in security, and this weak incentive is bound to be passed down the food chain to software developers. Second, companies have every incentive to store any and all information, making hacking a much more profitable venture. I based this portion loosely on a wonderful article in “The Conversation”. You can find it here:

Tirole(1994): The Theory of Industrial Organization.
Akerlof(1970): “The market for Lemons: Quality uncertainty and the market mechanism”; Quarterly Journal of Economics 84
Hypponen, Mikko(2011): “The history and evolution of Computer Viruses”. DefCon 19.
Torrey, Jacob(2015): “Depressing Effect of Bug Bounties”;
Microsoft(xxxx): “Security Researcher Acknowledgments”



  1. (Moving from twitter). Regarding your pond argument, you went in the right direction by relating it to animal populations, but I believe the more accepted practice for estimating bugs (or vulns) is the "capture-recapture" method of estimating populations. I think this was first proposed in an I M Wright "Hard Code" article (which I unfortunately can't find), but a similar write-up is at:

    Parts of that don't carry over perfectly, because you don't have 2 people looking at small, specific parts of a code base at the same time in order to count the overlap in bugs found, but in general it holds in real-world examples that code with many bugs found is likely to have many more as yet undiscovered bugs.
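    As an illustration of the capture-recapture idea, here is a minimal sketch using the Lincoln-Petersen estimator, which is one common form of the method. The bug identifiers and counts below are invented, and the estimate assumes that two reviewers search the same code independently and that every bug is equally likely to be found.

```python
def lincoln_petersen(found_by_a, found_by_b):
    """Estimate the total bug population from two independent bug lists.

    N is estimated as (bugs found by A * bugs found by B) / bugs found by both.
    """
    a, b = set(found_by_a), set(found_by_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("no overlap between the lists, the estimate is undefined")
    return len(a) * len(b) / overlap


if __name__ == "__main__":
    reviewer_a = range(0, 10)   # hypothetical bug ids 0..9, 10 bugs
    reviewer_b = range(6, 14)   # hypothetical bug ids 6..13, 8 bugs, 4 shared
    total = lincoln_petersen(reviewer_a, reviewer_b)
    known = len(set(reviewer_a) | set(reviewer_b))
    print(total, known, total - known)   # 20.0 14 6.0
```

    A small overlap between the two lists pushes the estimate up, which matches the intuition that rarely rediscovered bugs point to a large pond.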

    Regarding bug bounties, companies that are willing to offer them are often also taking other pro-active steps to search for vulnerabilities in their code base. For example, this likely involves paying for automated scanning services that run against the site 24/7, to hopefully detect a misconfiguration before bad guys (likely running the same tools) do. That's one end of the spectrum. At the other end, you likely contract out private audits to companies, potentially handing out some source code and having meetings between your engineers and their auditors. These are likely to cost more than the highest bug bounty payouts and will also find the most difficult-to-find bugs, but you probably only do this once a year or for major product releases. In between these are the bug bounties, which are black box testing; largely this is a way to outsource some continuous (24/7) security auditing to cheaper labor markets in between your private security audits.

    If you're upset at the low pay-out of bug bounties and believe your skill set is worth more, it may be because your skill set is worth more. In that case you aren't the target market for bug bounties, and you should instead approach these companies with a quote to audit them privately.

    Regarding "the real problem", yes, this is correct: companies unfortunately are not pushed by the markets or by legislation to improve their security. TalkTalk is a recent counter-example, as its hack actually did negatively impact the company's stock price (~24%). Why TalkTalk's stock dropped and not that of Target, Sony, or any others is interesting to consider. Maybe they didn't have cyber insurance? Maybe the laws in England are different from those in the US?

    1. 1. I apologize if I’ve made the impression that any of the models/arguments mentioned here are of my origin. “The Pond” argument suspiciously misses a reference and that’s just bad style. Mea culpa. I’m not aware of any other linkage between the model of lemons, “Credence goods” and bug bounties, but I’ve not searched for it either.
      2. The way I use the “pond argument”, I speak of all security bugs in all software as the population. I’m sorry if I did not make that sufficiently clear. I don’t wish or try to estimate the number of exploitable unknown bugs. I just opine that the population is large. My argument is that I, as a software developer, produce a relatively high number of bugs compared to what the 0-day community finds. I don’t pretend to be the world’s best coder, nor the worst. I could’ve used other lines of argumentation, such as the emergence of the Internet of Things, the ratio between the number of people making a living from finding 0-days and the number of people who develop software, the fact that no significant year-over-year drop in the number of CVEs has been seen, etc. I did not, for brevity. There is also evidence to the contrary, which I left out for the same reason. It occasionally happens that 0-day hunters comment that they had found the same bugs as others, which would be an indication of a small number of bugs in the “pond”. I remain committed to my opinion that the population is large.
      3. My intention in bringing the “pond model” into the text was to play down the importance of bug bounties. The argument I wanted to make was that bug bounties are, in my opinion, a step forward, but that they are not the solution. Despite my critique of Mr. Torrey, that brings me to some of the same conclusions that he reached, and I probably should’ve made that more clear. I do spend a few lines being critical of the “pond model” for not including the effects of imperfect information that I introduced in the first part of my blog post. The story of the “pond model” is much longer. I didn’t want to go there because I didn’t want to start piling on – keeping focus is a problem for me ;). While I think 0-day hunting cannot provide security overall, because the population of bugs is just too big, there is no reason why 0-day hunting cannot be useful in “local ponds”. In fact I think the Microsoft Code Red example I mention could be a case where Code Red and its clones contributed directly by thinning out the number of exploitable bugs in the RPC and LSASS modules (I admit I don’t have enough data to support that this actually happened, but I think the speculation is valid).
      4. Regarding bug bounties – you are absolutely right that software vendors take other measures than bug bounties. The point of my analysis is that the amount of measures they take depends on how much they improve their market position by taking them. Now it would be natural for anybody who wishes to signal high quality of their software with bug bounties to also actually raise the quality. And that argument runs in the other direction too: bug bounties are credible as a signal, because if the quality isn’t high, bug bounties will be expensive and might actually become a signal of poor quality, since the hackers lining up to claim the bounties will become informed of the inferior quality. This line of argumentation goes back to the “lemons model” of the first part of my post. Thus, though I do not actually write it, I’d expect to find a positive correlation between vendor efforts towards more secure software and the size of their bug bounty. That would be a testable real-world implication of how well my model describes reality, and thus you could (at least in theory) go from speculation to actually testing the model.
      5. The TalkTalk case is very interesting. I do not have sufficient knowledge to comment on it sensibly. If you have any links or other data, I’d much appreciate it if you shared them.

      So thanks very much for your comments. I hope this clarifies my views for you – if not I’m all ears.