IEEE Vic: Wed 27 18:00- Bayesian Decision Making in the Real World

Rodney Brown

25 Apr 2016

11:37 a.m.

*Bayesian Decision Making in the Real World*

Presented By: Dr. Kevin Korb
Date: Wednesday April 27, 2016
Time: 6:00pm - 7:30pm
Location: Room 80.03.021, Swanston Academic Bldg, 80/445 Swanston St, Melbourne VIC 3000 (Map <http://ieeevic.us4.list-manage1.com/track/click?u=ff449fc8ee84b59411cf34a0d&id=6c36c10e00&e=8e2288f491>)

*/Abstract:/*

Bayesian networks, extended to Bayesian decision networks (BDNs, aka influence diagrams), support decision making under uncertainty in an ideal (normative) way. That means they can provide guidance to people prepared to learn how to build and test them and then meticulously analyse a decision problem and tailor a BDN to it. In other words, BDNs are used by almost no one for real decision making. In this talk I'll outline what's involved in all of that in case you really want to do it (or just understand what's involved).

But for the vast majority of people there's a better way: pay (directly or indirectly) someone to build a GUI front-end that will hide the details from the user. For any particular decision problem there will be invariants that can be pre-built into a BDN. All that a user/customer need do is enter the user's specific context: general background info plus the needs and preferences that specific user has. The BDN can then assess the different choices available and their user-specific value. I will compare and contrast this with what's most commonly available, CHOICE-like comparative tables of costs and benefits of each alternative, which puts the entire burden on the customer, leading to highly suboptimal decisions.

Bayesian decision making is the wave of the future, so we may as well catch it now.

*/Speaker:/*

Dr Kevin Korb is a Reader in the Clayton School of Information Technology, Monash University and co-founder and Director of Bayesian Intelligence Pty Ltd. BI delivers training and Bayesian network modeling solutions to business and government. Korb received his PhD in philosophy of science from Indiana University in 1992. His research interests include Bayesian philosophy of science, causal discovery algorithms, Bayesian networks, evolutionary artificial life simulation, and the epistemology of simulation. He is the author of "Bayesian Artificial Intelligence" (CRC, 2010) and "Evolving Ethics" (Imprint Academic, 2010), co-founder of the journal Psyche, the Association for the Scientific Study of Consciousness and the Australasian Bayesian Network Modelling Society (ABNMS) and Chair of the IEEE Computational Intelligence Society in Victoria.

This is a free event open to the public. For catering purposes, please register at:
http://bayesian-decision-making-in-the-real-world.eventbrite.com.au/ <http://ieeevic.us4.list-manage.com/track/click?u=ff449fc8ee84b59411cf34a0d&id=ef7b56b40e&e=8e2288f491>

Kind Regards
Annick Boghossian
The Computer Society Chapter of IEEE Victorian Section Chair

Rick Moen

25 Apr

3:38 p.m.

Quoting Rodney Brown (rdbrown@pacific.net.au):

...

*Bayesian Decision Making in the Real World*

[...]

...

*/Abstract:/* Bayesian networks, extended to Bayesian decision networks (BDNs, aka influence diagrams), support decision making under uncertainty in an ideal (normative) way. That means they can provide guidance to people prepared to learn how to build and test them and then meticulously analyse a decision problem and tailor a BDN to it. In other words, BDNs are used by almost no one for real decision making. In this talk I'll outline what's involved in all of that in case you really want to do it (or just understand what's involved).

But for the vast majority of people there's a better way: pay (directly or indirectly) someone to build a GUI front-end that will hide the details from the user.

Interesting idea (and apparently the basis for the speaker's shiny new business that he'd like to promote). But Bayes's Theorem isn't really all _that_ difficult for laymen to apply in the real world without software front-ends. Let me quote an example I posted to a different mailing list this past February 17th (just as I was arriving in Sydney on cruise ship Crystal Serenity):

I'm just starting to catch up on old threads, having needed to keep Internet usage sparse during our ocean crossing. A week ago, seeing just the Subject header, I'd wondered if this thread were about the accuracy of surveillance-type facial-recognition by machines. I see it wasn't, but expect it's OK if I digress onto that.

So: Facebook, Google, Twitter, and such companies with huge collections of other people's tagged digital photos are monetising them. (Facebook's collection comprises something like 13 _trillion_ photos.) FBI has a database of 52 million faces, and describes its integration of facial recognition software with that database as 'fully operational'. The agency's director claims its database wouldn't include photos of ordinary citizens, though this is demonstrably contradicted by its own documents (https://www.eff.org/deeplinks/2014/04/fbi-plans-have-52-million-photos-its-n... -database-next-year).

Everyone appears to be rah-rah about how successful this is going to be in every possible application, if not today in year n, then surely in year n+1 -- and indeed in some applications it works well enough. However, when I heard that DHS [USA Department of Homeland Security] seriously expected to use automated facial recognition as the reason to detain Bad People in airports and elsewhere (the 'FAST program' - Future Attribute Screening Technology, started in 2012), I thought 'Guys, you've never heard of the base rate fallacy, have you?'

Or, to put it another way, DHS is yet another institution needing to learn Bayes's Theorem.

Base rate fallacy is the fallacy of ignoring the probability-skewing effect of a low base rate. I will explain: For the terrorists-in-airports example, the base rate would be the probability that any random person walking through an airport is actually a terrorist. Let's say an example airport has 1 million persons walking through it in a year (it's a small regional), and it's very popular with terrorists, such that we expect 100 terrorists to walk its halls in that year. So, the base rate of being a terrorist in the scenario is 0.0001. The base rate of being a non-terrorist in the scenario is 0.9999.

DHS gets the 'FAST program' going at the airport, and stocks its database with super-studly spook-approved photos. And DHS claims the software is really, really good! 1% error rate! Specifically, it says:

o Actual terrorists fail to trigger the klaxon 1% of the time (false negative). And...
o Non-terrorists trigger the klaxon 1% of the time (false positive).

(These are invented example numbers of mine, but I think within a realistic ballpark.)

DHS sends out a press release reporting glowingly positive results, because the system is '99% accurate'.

But what does '99% accurate' really mean in this context? It merely means a low error rate, not high accuracy. The accuracy is actually piss-poor, because, observe: 9,999 non-terrorist travelers during the studied year (1% of the 999,900 non-terrorists) got slammed up against the wall by the brute squad -- along with 99 terrorists, for a total of 10,098 klaxon soundings. So, the probability that a person triggering the alarm actually is a terrorist is only about 99 in 10,098, which is 0.98% accuracy.

I call _accuracy_, here, the probability of terrorist given klaxon, which we'll call 'p(terrorist|K)', where p() means probability of, and the | character means 'given'.

Bayes's theorem says: p(terrorist|K) = p(K|terrorist) times p(terrorist) divided by p(K).

p(K|terrorist) is 99 / 100 = .99000000 (1% false negative)
p(terrorist) is 100 / 1000000 = .00010000
p(K) = 10098 / 1000000 = .01009800

Probability of terrorist given klaxon is thus .00980392 or only 0.98% accuracy -- less than 1% accurate, though I have little doubt DHS would call it '99% accurate' (ignoring the low base rate).

And the point is, this sort of fallacy occurs _all the time_ when people talk about probabilities and rates of success for infrequent events and large amounts of data.

Quoting Dave Palmer:

...

http://arstechnica.co.uk/security/2016/02/the-nsas-skynet-program-may-be-kil...

Quoting from the article:

The 0.008 percent false positive rate would be remarkably low for traditional business applications. This kind of rate is acceptable where the consequences are displaying an ad to the wrong person, or charging someone a premium price by accident. However, even 0.008 percent of the Pakistani population still corresponds to 15,000 people potentially being misclassified as "terrorists" and targeted by the military -- not to mention innocent bystanders or first responders who happen to get in the way.

Once again, classic base rate fallacy. The 'failure rate of 0.008%' figure is totally wrong.

"On whether the use of SKYNET is a war crime, I defer to lawyers," Ball said. "It's bad science, that's for damn sure, because classification is inherently probabilistic...."

Worse than that, it's classification on grounds of mathematically incompetent calculation of that probability.

A jury of random citizens in the 1996 UK rape case Regina v. Denis Adams (https://en.wikipedia.org/wiki/R_v_Adams) learned to correctly apply Bayes's Theorem -- but their usage of Bayesian inference was ultimately overturned by an appellate judge (http://www.bailii.org/cgi-bin/markup.cgi?doc=/ew/cases/EWCA/Crim/2006/222.ht... - a horrific scientific blunder on judge's part, IMO):

Quoting Greg B (cyclopasaurus@gmail.com): [I only barely remember this case, but:]

...

A proper Bayesian approach to the evidence should have led the jury to at least a reasonable doubt conclusion, but the calculations are both involved and require counter-intuitive thinking, and further the last thing the prosecution wanted was a bunch of thinking jurors.

The Appeal Court ruling calling Bayes's Theorem inappropriate in the courtroom was appalling, and is the sort of thing jurists are going to be embarrassed about in the future. The jurors' scenario was a classic situation where frequentist notions of probability were IMO even more idiotic than usual. (Fortunately, my understanding is that the Appeal Court judge's comments on that matter have no binding power for the future.)

Here's the nub of the problem, as quoted from the magazine article quoted in the Wikipedia article's Notes section:

However, there was very strong DNA evidence linking him with the crime and when the case came to trial in 1995, effectively the only incriminating evidence was that his DNA profile matched the DNA evidence found at the scene of the crime. The prosecution forensic scientist had calculated what is called a match probability, that is, the probability that if you pick someone at random, their DNA would match the DNA sample of the assailant. That, according to him, was 1 in 200 million.

It's tempting for people not trained in statistics to get muddled, and to confuse two different probabilities. The first is the probability that a person would match the criminal's DNA profile given that they are innocent. The second is the probability that they are innocent given that they match the DNA profile. The forensic scientist's 1 in 200 million refers to the first probability. But jurors may wrongly think that this is the probability that the defendant is innocent. This misunderstanding is called the prosecutor's fallacy, and can be extremely prejudicial to the defendant.

http://onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2005.00089.x/epdf

The forensic scientist (testifying for prosecution) said that, if Denis Adams is innocent, there's only a 1 in 200 million chance of his DNA matching that of the real assailant. In other words, there were probably only a few other men in all of the UK who might have matched. Which is actually pretty suggestive of guilt, but would have to be considered in contrast to all the other evidence (all of which pointed the other way).

However, what the jury tends to hear the forensic scientist say is that, given that Denis Adams matched the DNA sample, there's only a 1 in 200 million chance he's innocent, which is a completely different statement _not_ supported by the testimony, and very prejudicial against defence.

Quoting Laurie Forbes:

...

I don't understand why "friend of the court" expert witnesses are not used more frequently in these types of cases.

The article I cited was in fact written by an Oxford statistician called in to advise the court (and jury) in the original trial. He taught the jury how to do Bayesian inference -- which later was ignorantly deprecated by the Appeal Court judge.
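
To make the two probabilities in the DNA discussion concrete, here is a small Python sketch. Only the 1-in-200-million match probability comes from the testimony described above. The pool of 200,000 possible perpetrators is a purely hypothetical number picked for illustration, the variable names are my own, and the sketch assumes the DNA match is the only evidence and that the true perpetrator always matches.

```python
from fractions import Fraction

# From the testimony: p(match | innocent) for a person picked at random.
p_match_given_innocent = Fraction(1, 200_000_000)

# HYPOTHETICAL: how many people could conceivably have committed the crime.
pool = 200_000

# Prior: before any evidence, everyone in the pool is equally likely.
prior_guilty = Fraction(1, pool)
prior_innocent = 1 - prior_guilty

# ASSUMPTION: the guilty person matches with probability 1.
p_match = prior_guilty * 1 + prior_innocent * p_match_given_innocent

# Bayes's Theorem: p(innocent | match).
p_innocent_given_match = prior_innocent * p_match_given_innocent / p_match

print("p(match | innocent) =", float(p_match_given_innocent))
print("p(innocent | match) =", float(p_innocent_given_match))
```

Under these assumed numbers, p(innocent | match) comes out at roughly 1 in 1,000, which is a very different figure from 1 in 200 million, and that is before any of the non-DNA evidence is weighed.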

Russell Coker

4:30 p.m.

Firstly, while I think that the IEEE meeting in question is interesting and worth attending, I've booked in for the WEHI Immunology Discovery Tour and the SecTalks Melbourne Meetup on that day, both of which are more interesting to me. But if medical research and computer security aren't interesting to you then the IEEE lecture in question seems worth attending.

On Tue, 26 Apr 2016 01:38:43 AM Rick Moen via luv-talk wrote:

...

Everyone appears to be rah-rah about how successful this is going to be in every possible application, if not today in year n, then surely in year n+1 -- and indeed in some applications it works well enough. However, when I heard that DHS [USA Department of Homeland Security] seriously expected to use automated facial recognition as the reason to detain Bad People in airports and elsewhere (the 'FAST program' - Future Attribute Screening Technology, started in 2012), I thought 'Guys, you've never heard of the base rate fallacy, have you?'

Or, to put it another way, DHS is yet another institution needing to learn Bayes's Theorem.

The DHS is entirely based on cowardice, it wouldn't exist if the world was run by valorous people. No amount of learning can make cowards act logically, because they act out of fear. I think that the best definition of a coward is someone who acts in a way that increases the risk to themselves because they are so weak and fearful.

The best example of this is the gun cowards. The NRA has succeeded in preventing the CDC etc. from analysing the risks of guns, and the cowards support them all the way. It's proven that having a gun in a home dramatically increases the incidence of successful suicide and of disputes ending in homicide.

...

through it in a year (it's a small regional), and it's very popular with terrorists, such that we expect 100 terrorists to walk its halls in that year. So, the base rate of being a terrorist in the scenario is 0.0001. The base rate of being a non-terrorist in the scenario is 0.9999.

Terrorism appeals mostly to losers. How stupid does someone have to be to think that you can detonate C4 or detonator cord (based on PETN) with a match? But because of that loser and the cowards who want security theatre, many of us have had to needlessly take our shoes off. If the terrorism masterminds had any mental capacity they would have sent losers with all manner of unusable explosives to increase the range of stupid hoops that we have to jump through at airports to appease the cowards.

...

o Actual terrorists fail to trigger the klaxon 1% of the time (false negative). And...

o Non-terrorists trigger the klaxon 1% of the time (false positive).

(These are invented example numbers of mine, but I think within a realistic ballpark.)

http://gizmodo.com/95-percent-of-fake-bombs-made-it-through-airport-securi-1708318199

No, the DHS, TSA et al are much less successful than that at detecting explosives. Also they steal from baggage and sometimes lose their guns in the secure area of airports.

...

...
http://arstechnica.co.uk/security/2016/02/the-nsas-skynet-program-may-be-killing-thousands-of-innocent-people/

Quoting from the article:

Anyone who names a computer system "skynet" is obviously happy with killing innocent civilians. This is obvious and doesn't even need discussion.

...

Once again, classic base rate fallacy. The 'failure rate of 0.008%' figure is totally wrong.

"On whether the use of SKYNET is a war crime, I defer to lawyers," Ball said. "It's bad science, that's for damn sure, because classification is inherently probabilistic...."

Skynet in fiction is an evil worse than the 3rd Reich. Noting this fact is not a Godwin violation as it's quite reasonable to compare genocides. Skynet in fiction aimed to exterminate the entire human race.

The precedent when dealing with Nazis is that following orders is not acceptable as a legal defense. If the US had a functional justice system then the death penalty would be applied to the people deemed responsible.

--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

Rick Moen

9:18 p.m.

Quoting Russell Coker (russell@coker.com.au):

...

Firstly, while I think that the IEEE meeting in question is interesting and worth attending, I've booked in for the WEHI Immunology Discovery Tour and the SecTalks Melbourne Meetup on that day, both of which are more interesting to me. But if medical research and computer security aren't interesting to you then the IEEE lecture in question seems worth attending.

On Tue, 26 Apr 2016 01:38:43 AM Rick Moen via luv-talk wrote:

...
Everyone appears to be rah-rah about how successful this is going to be in every possible application, if not today in year n, then surely in year n+1 -- and indeed in some applications it works well enough. However, when I heard that DHS [USA Department of Homeland Security] seriously expected to use automated facial recognition as the reason to detain Bad People in airports and elsewhere (the 'FAST program' - Future Attribute Screening Technology, started in 2012), I thought 'Guys, you've never heard of the base rate fallacy, have you?'

Or, to put it another way, DHS is yet another institution needing to learn Bayes's Theorem.

The DHS is entirely based on cowardice, it wouldn't exist if the world was run by valorous people.

US Department of Homeland Security is an umbrella under which all of these were gathered in 2002:

o Immigration and Naturalization Service (INS): Agency that processes the application for citizenship or residency of tens of thousands per year. My father doubtless spoke with INS when he immigrated with his family from Norway around 1933. INS also at first included U.S. Border Patrol at its creation in the early 20th C., which was shifted among numerous Federal departments before being put under DHS in 2002.

o U.S. Customs Service: Collected import tariffs and performed other selected border security duties.

o Animal and Plant Health Inspection Service: Worked with other agencies to protect U.S. agriculture from invasive pests and diseases.

o Federal Protective Service: Provided law enforcement and security services to U.S. Federal buildings, courthouses, and other properties.

o Transportation Security Administration: Oversees security for highways, railroads, buses, mass transit systems, pipelines and ports.

o United States Coast Guard: Maritime homeland security, maritime law enforcement (MLE), search and rescue (SAR), marine environmental protection (MEP), the maintenance of river, intracoastal and offshore aids to navigation (ATON).

o United States Secret Service: Federal law enforcement agency investigating and preventing financial crimes, including counterfeit U.S. currency, U.S. treasury securities, and investigation of major fraud. Also handles safety of current and former national leaders and their families, such as the President, past Presidents, Vice Presidents, presidential candidates, visiting heads of state, and foreign embassies.

o Federal Emergency Management Agency: Coordinates response to any natural disaster that has occurred in the United States and that overwhelms the resources of local and state authorities.

o National Protection and Programs Directorate: Has mission to reduce and eliminate threats to the USA's critical physical and cyber infrastructure.

I cannot help noticing, Russell, that when people on the Internet start moralistically preaching and ranting, they tend to switch their brains pretty much entirely off. Thus my catchphrase in tongue-in-cheek praise of such postings: 'It's easier than thinking!'

Rick Moen

11:18 p.m.

I wrote: [Snip application of Bayes's Theorem to facial recognition in airports looking for known terrorism suspects]

...

And the point is, this sort of fallacy occurs _all the time_ when people talk about probabilities and rates of success for infrequent events and large amounts of data.

[...]

...

Once again, classic base rate fallacy. The 'failure rate of 0.008%' figure is totally wrong.

There are several _very_ common statistical fallacies people commit in situations involving low probabilities. Poor understanding of probabilities and statistics is increasingly becoming a huge problem -- in criminal jury trials, setting of government policy, and elsewhere.

o Base-Rate Fallacy: irrationally disregarding _completely_ general likelihood information ('base rate' information) you have about rarity of something, because you've focussed on specific details of a case. (The better alternative is to weigh both types of information appropriately.)

o Prosecutor's Fallacy: confusing the likelihood of a suspect satisfying a description with the likelihood of any person satisfying the description being guilty -- i.e., confusing the likelihood of seeing some evidence at all with the likelihood of that evidence indicating guilt.

o Defendant's Fallacy: confusing extreme rarity of a trait with very low likelihood of someone having that trait being guilty.

The Wikipedia explanations of all three are _utterly_ wretched. I'm a mathematician, and *I* find those write-ups confusing and useless. For some reason, this really bothers me. I guess I like clarity.

This page, unlike the execrable Wikipedia one, explains Prosecutor's Fallacy really well:
http://www.conceptstew.co.uk/pages/prosecutors_fallacy.html

This one is pretty good:
http://www.agenarisk.com/resources/probability_puzzles/prosecutor.shtml

These pages do a so-so job of explaining Base-Rate Fallacy:
https://www.logicallyfallacious.com/tools/lp/Bo/LogicalFallacies/55/Base_Rat...
http://www.fallacyfiles.org/baserate.html

This one explains Defendant's Fallacy pretty well:
http://www.agenarisk.com/resources/probability_puzzles/defendant.shtml

All three fallacies go away if you _apply_ Bayes's Theorem to probability problems, but hardly anyone will do that (without goading, anyway), and it'd be better if folks could better master probability intuitively.
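
As a rough illustration of what applying Bayes's Theorem involves, here is a minimal Python sketch that reruns the airport klaxon example from earlier in the thread. The function name `posterior` and its parameter names are my own illustrative choices, and the numbers are the invented ones from that example.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Return p(hypothesis | alarm) using Bayes's Theorem."""
    # p(alarm) = p(alarm | H) * p(H) + p(alarm | not H) * p(not H)
    p_alarm = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / p_alarm


# Invented numbers from the airport example: 100 terrorists among
# 1,000,000 travellers, 1% false negatives, 1% false positives.
p_terrorist = 100 / 1_000_000
p_terrorist_given_klaxon = posterior(
    prior=p_terrorist,
    true_positive_rate=0.99,   # 1 minus the false negative rate
    false_positive_rate=0.01,
)
print(f"p(terrorist|K) = {p_terrorist_given_klaxon:.4f}")
```

The result matches the 99-in-10,098 figure worked out by hand, and changing the prior shows how strongly the base rate drives the answer.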
I like the conceptstew.co.uk page's rundown on Prosecutor's Fallacy (using a hypothetical criminal case) so much, I'm going to summarise it below:

A purse-snatcher in London absconded with quite a lot of cash, but the victim gave a detailed description of the thief that included several distinctive physical traits. A suspect got picked up the next day who matched all those traits. He was arraigned. There was no other physical evidence.

At trial, the Crown argued population probabilities, quizzing as expert witness a government statistician: 'Being male is 0.51 likely. Being 2 metres tall is 0.025. Being between 20 and 30 years old is 0.25. Being red-headed is 0.037. Having a pronounced limp is 0.017.' Likelihood of all these independent traits at once is 0.51 x 0.025 x 0.25 x 0.037 x 0.017 = 0.000002 -- or about one in half a million.

'The chance of any random individual sharing all these characteristics is vanishingly small - only 0.000002. The prisoner has them all.' So, he argued, the chance of him being innocent is infinitesimal, and he should be convicted.

Defence counsel rose and cross-examined the expert witness: 'What is the population of London?' The witness looked startled, but replied 'I think it is about 10 million.'

'So based on your statistics, how many people in London have this set of characteristics?' He blustered a bit, but was forced to admit that there should be 20.

'Given that your evidence is based solely on a description and that you have admitted that there are 20 people in London who fit this description, this must mean that the probability of my client's guilt is very small, only 1 in 20. Or to put it another way, the chance of his innocence is 19 in 20, not 1 in half-a-million.'

Page concludes:

The expert witness [during direct testimony] confused two things:

o the probability of an individual matching a description, and
o the probability of an individual who _does_ match that description being guilty

They are not the same! It is easier to see the fallacy as soon as the probability of 0.000002 is turned into numbers of real people: When you bear in mind that the population of possible suspects is 10 million, 1 in half a million easily translates into 20 possible suspects - the accused is only one of this group, and, if we are to be convinced of his guilt with no other evidence, we would want to know that the other 19 had been excluded. And that is without even considering people who might have come up to London for the day!
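
The purse-snatcher arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation in the summary above: the variable names are mine, the traits are treated as independent as the witness did, and everyone in London is treated as an equally plausible suspect before the description is considered.

```python
from math import prod

# Trait frequencies quoted by the expert witness.
trait_probs = {
    "male": 0.51,
    "2 metres tall": 0.025,
    "aged 20 to 30": 0.25,
    "red-headed": 0.037,
    "pronounced limp": 0.017,
}
population = 10_000_000  # the witness's estimate for London

# Chance that a random person has every trait (independence assumed).
p_match = prod(trait_probs.values())

# Expected number of Londoners who fit the description.
expected_matches = p_match * population

# With no other evidence, each matching person is equally likely to be the thief.
p_guilty_given_match = 1 / expected_matches

print(f"p(match)          = {p_match:.7f}")
print(f"expected matches  = {expected_matches:.0f}")
print(f"p(guilty | match) = {p_guilty_given_match:.2f}")
```

The first number is what the Crown quoted and the last is what the jury actually needs: about 0.05, i.e. roughly 1 in 20.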
