A Drop of Water

Sunday, November 2, 2014

On Academic Presentation

(v0.1)

I will present our paper at a CCS workshop next Friday, and then present my thesis proposal at the comprehensive exam the Friday after. Facing these two important occasions, I decided to summarize my current understanding of presentations. This is NOT a collection of advice, because I am far from a good academic speaker. I simply hope this article sparks some discussion and helps you think about what leads to a good academic presentation.

Here I have several points to share:

(1) A clear story flow is the top priority. A good flow holds the attention of the audience. As others have said [1], flow is much more important in slides than in a paper, because the audio channel is more brittle. In addition, a good flow helps the presenter remember what to say.

I think there are at least two types of flows:

The logic flow of the research. The audience should see the natural transition between research steps, so that they can appreciate the work.
The knowledge flow. We need to introduce enough background before going into details. Also, make sure that terms etc. are understandable.

In addition, try to have only one story line. It is true that a research project usually expands into several branches, but presenting them all will interrupt the flow and confuse the listeners.

(2) Presentation is a process of convincing others. The listeners will be convinced if the study is rigorous and the language is accurate. Do not overclaim.

(3) Make the presentation tight. Try to connect things together, and refer back to important earlier points. This actually increases the complexity of the presentation structure, and people enjoy complexity. Similar strategies are often used in movies; Lock, Stock and Two Smoking Barrels is a perfect example.

(4) Presentation is also a form of teaching. Think about what the audience will learn from it.

(5) We each have our own presentation style, and I feel it is generally hard to copy someone else's. For example, native English speakers can use jokes and funny pictures (e.g. the one used in this blog), which are sometimes hard for non-native speakers to understand, let alone to deliver. Nonetheless, one can still give a good talk without these funny elements. I sometimes think too much "fun" actually has a negative effect, i.e., "amuse to death".

Here are the general steps I take to prepare a presentation. Please feel free to comment on them and offer your own opinions:

(1). Have a rough story line first.

(2). Turn the story line into slides. Focus more on the completeness of the information.

(3). Practice lightly and then update the slides. At this stage don't expect them to be perfect. Also take a look at similar talks to "steal" good presentation ideas.

(4). Write a script for every slide, or at least an outline. You don't need to read it aloud, but it will remind you of the story line. Also, written text is easy to study and improve.

(5). Practice seriously.

(6). Present to others. Your adviser and research collaborators are the best choices. They know your research, but they are not trapped in a myriad of details like you are, so they can give very good suggestions for improving the story line! People with enough background knowledge (e.g. your lab mates) are also good: they can tell you which parts are unclear or confusing. Also, try to collect creative presentation ideas from others.

(7). Improve slides, practice, improve slides, ....

In general, you will feel unconfident and uncomfortable at the beginning, because the quality of your taste is always ahead of the quality of your work [2]. However, as long as you keep improving it, the final version will be very good. Furthermore, after thorough preparation, you will not only give a great talk, but also find new research ideas!

References:

[1] 博士五年总结（三） (Summary of Five Years as a PhD Student, Part 3), http://blog.sina.com.cn/s/blog_946b64360101dych.html

[2] Ira Glass on Storytelling, http://vimeo.com/24715531

[3] The picture. http://assets.diylol.com/hfs/ae1/38e/525/resized/business-cat-meme-generator-boss-wished-me-luck-on-the-presentation-like-i-need-it-52c717.jpg

Saturday, October 11, 2014

English Name or Not?

(v0.1)

As a Chinese student in America, an important question to ask is: should I choose an English (first) name? Those who are against the idea usually make the following points:

The original name defines your identity.
You should respect the original name because it was given by your parents.
If you are good, others will pronounce and remember your name correctly anyway.

Some of my American, Indian and Chinese friends hold these views. Some time ago, I also watched a YouTube video in which an American student advocated these points to some Taiwanese students.

Other people, such as Philip Guo [2], support the idea of choosing an English name when moving to an English-speaking country.

And here is my opinion: although I currently do not have an English name, I agree that Chinese students (and possibly other East Asian students) studying in America should find an English (first) name. Obviously, an English name is easier to pronounce and remember, both for natives and for students from other countries. An English name can also indicate the person's gender, which is more convenient in some situations. I guess the reason most Indian students don't choose an English name is that their original names are relatively easy to pronounce and already indicate gender, at least in my experience. After all, English and Hindi both belong to the Indo-European language family.

Furthermore, I disagree with the three points against finding an English name. To refute them, we can look in the opposite direction: what did Westerners do when they were in China? During the Age of Discovery, many Jesuit priests came to China and played an important role in the communication between civilizations. These priests all used Chinese names, such as 利玛窦 (Matteo Ricci, the man in the above figure), 汤若望 (Johann Adam Schall von Bell) and 郎世宁 (Giuseppe Castiglione), which are still known to many Chinese today.

Also, having a second name is actually a part of traditional Chinese culture. In ancient China, people went by their style name (字) rather than their given name in daily life, and it was actually impolite to address someone by the given name. It is not a bad idea to consider the English name as a style name.

References

[1] The picture, http://www.faculty.fairfield.edu/jmac/sj/scientists/riccimap.gif

[2] http://www.pgbovine.net/choosing-english-name.htm

Saturday, September 20, 2014

A Quick Analysis of Facebook Bug Bounty Program

(v2, updated 10/15/2014)

Nowadays, Web companies rely on vulnerability reward programs (VRPs, also called bug bounty programs) to discover vulnerabilities in their products. Basically, a white hat (good hacker) can submit a vulnerability report and get some money in return. We have written a preliminary paper analyzing a related program called Wooyun; please take a look if you are interested in this new paradigm of improving security.

Facebook is one of the companies that embrace this idea, although Facebook is generous sometimes (see this and this) :P. FB also hides information about which vulnerabilities have been discovered and the details of each white hat's accomplishments (e.g. how many vulnerabilities one has discovered, and when). FB only provides a list of the white hats who have contributed to Facebook security each year, at this page.

Anyway, we can start with this page and do some quick analysis. The data was collected on 9/20/2014. First, there are 670 names on the list (in several cases multiple names appear on one line, separated by commas, and we count each name individually). Quite a lot, isn't it? But some enthusiastic white hats may have contributed in several years and so appear multiple times, so we also count the number of unique names, which is 516.
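The counting rule described above (split comma-separated lines, then count all names and unique names) can be sketched in a few lines of Python. This is only an illustration under assumptions: the names and the `sample_list` structure are hypothetical stand-ins, not the real thank list, and matching names by exact string is a simplification.

```python
def split_names(lines):
    """Split each thank-list line on commas and return the cleaned names."""
    names = []
    for line in lines:
        for part in line.split(","):
            name = part.strip()
            if name:  # skip empty fragments such as a trailing comma
                names.append(name)
    return names


# Hypothetical stand-in for the thank list: year -> lines as they appear on the page.
sample_list = {
    "2014": ["Alice Example", "Bob Example, Carol Example"],
    "2013": ["Alice Example", "Dave Example"],
}

# Every appearance counts toward the total; a set removes repeat contributors.
all_names = [name for lines in sample_list.values() for name in split_names(lines)]
total_count = len(all_names)
unique_count = len(set(all_names))
print(total_count, unique_count)
```

In practice, the same person may be spelled slightly differently in different years, so the unique count from exact string matching is best read as an upper bound.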

Next, we count the number of white hats each year, shown in the following table:

Time	White Hat Count
2014 (up to 9.20)	191
2013	255
2012	126
2011	55
Prior to 2011	43

We clearly see the trend: more and more players are joining this game, and the number roughly doubles every year :) I guess VRP is really a promising idea (please see our paper for more discussion).
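As a quick check on the "roughly doubles" reading, here is a small sketch that computes year-over-year ratios from the table. It uses only the full calendar years; leaving out the partial 2014 figure and the "Prior to 2011" bucket is my own choice, on the assumption that neither covers exactly one full year.

```python
# White hat counts for full calendar years, copied from the table above.
yearly_counts = {2011: 55, 2012: 126, 2013: 255}

# Ratio of each year's count to the previous year's count.
years = sorted(yearly_counts)
growth = {
    year: yearly_counts[year] / yearly_counts[year - 1]
    for year in years[1:]
}

for year, ratio in growth.items():
    print(f"{year}: {ratio:.2f}x the previous year")
```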

There is also an interesting fact: a lot of white hats are active in only one year. To show this, we create another table counting white hats by the number of years in which they were active:

Number of Years Active	White Hat Count
1	402
2	82
3	26
4	5
>=5	1

So far, 402 white hats have appeared in only one year's thank list, and the distribution is highly skewed: far fewer white hats are active for more than one year, and only one person has been thanked in every period! This probably shows that the value of this kind of VRP lies not only in a few experts, but also in a large number of people. But since we don't know how many vulnerabilities each white hat contributed or how severe they were, it is hard to draw a firm conclusion. Still, this observation is consistent with what we claim in our paper.
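The years-active table can be rebuilt from per-year name lists with a short counting sketch. The sample names below are hypothetical, and storing the ">=5" row under the key 5 is an assumption made only so the published figures can be summed.

```python
from collections import Counter


def years_active_distribution(names_by_year):
    """Map 'number of years a name appears in' to 'how many names'."""
    years_per_name = Counter()
    for names in names_by_year.values():
        for name in set(names):  # a name counts at most once per year
            years_per_name[name] += 1
    return Counter(years_per_name.values())


# Hypothetical sample data, not the real thank list.
sample_names_by_year = {
    "2014": ["Alice Example", "Bob Example", "Carol Example"],
    "2013": ["Alice Example", "Dave Example"],
    "2012": ["Alice Example", "Bob Example"],
}
sample_distribution = years_active_distribution(sample_names_by_year)
print(sorted(sample_distribution.items()))

# Figures from the table in this post; the ">=5" row is stored under key 5.
published = {1: 402, 2: 82, 3: 26, 4: 5, 5: 1}
unique_total = sum(published.values())
single_year_share = published[1] / unique_total
print(unique_total, f"{single_year_share:.1%}")
```

The rows of the published table sum to the 516 unique names reported earlier, which is a useful sanity check on the two counts.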

You might wonder who the "all the time" person is, and the answer is: Szymon Gruszecki. You can access his personal page here.

Please feel free to discuss by leaving a message. Thank you for your time!

Update:

Facebook has released some interesting statistics of its bounty program here:
https://www.facebook.com/notes/facebook-bug-bounty/bug-bounty-highlights-and-updates/818902394790655

Some interesting points:

From the statistics we see that there is a huge number of invalid reports. The valid rate is only 4.7%. Why?
It says that "One of the most encouraging trends we've observed is that repeat submitters usually improve over time. It's not uncommon for a researcher who has submitted non-security or low-severity issues to later find valuable bugs that lead to higher rewards." Actually, we plan to investigate this issue further in our data set.
The country rank: Russia -> India -> USA -> Brazil -> UK

References

[1] The picture. http://america.aljazeera.com/content/dam/ajam/images/shows/Real%20Money%20with%20Ali%20Velshi/SG_FB2_1460.jpg

Saturday, August 23, 2014

Bugs and Patches for Papers

In an earlier article, Writing Like Compiling, I drew some connections between programs and papers. This article makes a connection from a different perspective.

For publications in Computer Science (and possibly other domains), there is a problem. A paper can contain bugs: errors, unclear sentences, missing background, etc. These bugs might be caused by the knowledge gap between the authors and the readers. Or they simply arise from the conference-driven publication paradigm, which puts researchers in a fast race and leaves them less time to ponder and polish their work [2]. These bugs afflict readers' minds and eat up their time. Some smart readers might find ways to fix those bugs, just like an advanced user who finds a bug in a program and writes a patch for it. However, since there is no good way to share the fix, and a paper is usually frozen after the camera-ready version, these paper patches only survive as red marks on a printed copy...

Programs, too, are not perfect after release. However, software developers and users constantly discover new bugs and apply corresponding patches, and this model generally works well. After all, there seems to be no alternative. Therefore, I think we need to treat papers as programs and create ways to report bugs and share patches. A first step is to store these paper bugs and patches in a database and let readers search them. However, we want to avoid detaching the patches from the papers, so we could allow energetic readers to fork a paper and make version 2.0 of it. This ability to fork is very common in the open-source software community [3].

In general, isn't it a bit ironic that Computer Science, the field that aims to digitize paper-based information, still records its cutting-edge findings in papers?

References

[1] The picture. http://www.rebeccaheflin.com/wordpress/wp-content/uploads/2013/08/rejected-writing.jpg

[2] Fortnow, Lance. "Viewpoint: Time for computer science to grow up." Communications of the ACM 52.8 (2009): 33-35.

[3] Raymond, Eric. "The cathedral and the bazaar." Knowledge, Technology & Policy 12.3 (1999): 23-49.

Saturday, July 12, 2014

Two Downsides of Privacy

(v0.2)

These days everyone is talking about privacy, partly thanks to Mr. Snowden's efforts. While I definitely support the individual right to privacy, I want to talk about two downsides of privacy here. But again, I strongly agree that each individual should have full control of her or his information. I just think that sometimes we might want to trade privacy for more important things.

The first downside is that privacy could lead to distrust. An often-used example in support of privacy is this: you got drunk one night and shared a photo of the drinking on Facebook. While your friends might like it, your boss doesn't. So people have been designing advanced access control technologies to keep your boss away from your "little secrets". Some people might even avoid using Facebook, considering that some companies require employees' Facebook passwords [2]. However, disclosing such personal information could help others understand you better and thus strengthen the relationship. On the other hand, if a person cannot be traced at all on the Internet, he or she will be a mystery in others' eyes. And will you trust a mysterious figure?

The second downside is that privacy might consign a person to oblivion. From East to West, from past to present, immortality has been an eternal pursuit of mankind. At the least, a person wants to leave something to this world after death, which only a small group of people could achieve in the past. The digital age offers immortality to everyone, in the sense that their words and activities on the public Internet can be recorded and kept almost forever. A search engine could retrieve one's words and return them to somebody in the future, and we could imagine this as a kind of conversation between the dead and the living. However, privacy would keep this information secret to only a few people or even one person, through technologies like encryption. If that person happens to pass away and no one else knows the password, then his words are gone with him. What a pity it would be if Einstein II encrypted his remarkable theory and then passed away in an accident. Wouldn't it be better to publish it on a blog, like this blog?

Update: we all feel really sad about the MH17 tragedy. It has been said that there were more than 100 AIDS researchers on board. I hope we can rescue as many of their ideas and thoughts as possible.

References:

[1] The picture. http://blog.static.abine.com/blog/wp-content/uploads/2011/10/privacy.jpg?e835a1

[2] http://www.usatoday.com/story/money/business/2014/01/10/facebook-passwords-employers/4327739/

Friday, June 27, 2014

Heartbleed and the Paradox of Security Professionals

The recent Heartbleed vulnerability in OpenSSL shook the whole Internet. Yet such an incident might not be a total surprise, because before the incident OpenSSL had only one full-time employee and received $2000 a year in donations [1]. Such support is by no means enough for the developers to produce high-quality code and test the software comprehensively. I guess even the hackers who keep searching for vulnerabilities in OpenSSL have much more funding.

But the situation has changed now, as tech giants have agreed to fund OpenSSL with at least $3.9 million over three years. Even a Chinese mobile company, Smartisan, announced a donation of 1 million yuan ($160,000) [2]. You might already sense the strangeness of this event: a mistake brings in millions!

The more astonishing thought is that if the OpenSSL developers had done a better job and not introduced the vulnerability, they would still be starving and suffering in poverty! This paradox applies not only to OpenSSL, but probably to every company that needs security. In such a company, if the security team is doing a good job, the CEO might feel that the team is redundant because nothing bad happens. Although the CEO might not be so dumb as to fire the security team, the CEO may still fail to appreciate its effort and might not raise its members' salaries. Thus the security team has hardly any incentive to do better. Rather, they might just aim to meet the minimum requirements, or even tolerate some security incidents to attract the attention of company managers.

Do you know how to break this paradox?

References:

[1] Tech giants, chastened by Heartbleed, finally agree to fund OpenSSL. http://arstechnica.com/information-technology/2014/04/tech-giants-chastened-by-heartbleed-finally-agree-to-fund-openssl/

[2] http://www.ithome.com/html/android/86232.htm

[3] The picture. http://www.paradoxproductions.com/pics/tritwo.gif

Tuesday, May 20, 2014

Research vs. Learning

A PhD student has two roles. One is a research assistant who strives to produce good research papers. The other is a learner who keeps improving oneself. Actually, even after obtaining a PhD, a researcher still needs to learn, and might need to do so for a lifetime. Research results are explicit while learning results are implicit. That is why some students and faculty members (e.g. [2]) ignore the latter role. However, I would like to use a simple model to show that ignoring it leads to inefficiency.

Let's model making research progress as random sampling from a normal distribution that represents the capability of the PhD student. A higher sample value corresponds to higher research quality. As more samples are drawn, their average approaches the mean of the distribution, and only a few of them have a high value.

Now, students A and B both start with a normal distribution with mean = 1 and variance = 3. We further assume that a sample x > 5 means a very good paper and x < 0 means a failed research project. The figure is:

Now, A decides to focus solely on research, so A keeps sampling from this distribution. However, the probability of producing a good paper is only 1.0%, so 100 trials lead to about one good paper. And the chance of failure is 28.2%, so more than 1/4 of the trials result in failure.

On the other hand, B focuses on learning, by which B pushes the mean of the distribution to 3:

For B, the probability of producing a good paper is 12.4%, which is better than A's by an order of magnitude. Moreover, the probability of failure now drops to 4.2%.
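The percentages for A and B follow directly from the normal CDF, and can be recomputed with a short sketch. It assumes only the parameters stated above (variance 3, "good paper" threshold 5, "failure" threshold 0); the function names are my own, and the CDF is evaluated with the standard library's error function rather than any particular statistics package.

```python
import math


def normal_cdf(x, mean, variance):
    """P(X <= x) for X ~ Normal(mean, variance)."""
    std = math.sqrt(variance)
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def outcome_probabilities(mean, variance=3.0, good=5.0, fail=0.0):
    """Return (P(sample > good), P(sample < fail)) under the rough model."""
    p_good = 1.0 - normal_cdf(good, mean, variance)
    p_fail = normal_cdf(fail, mean, variance)
    return p_good, p_fail


# Student A keeps mean = 1; student B has pushed the mean to 3.
for label, mean in (("A", 1.0), ("B", 3.0)):
    p_good, p_fail = outcome_probabilities(mean)
    print(f"{label}: P(good paper) = {p_good:.1%}, P(failure) = {p_fail:.1%}")
```

Changing the `mean` argument shows how sensitive both tail probabilities are to even a small shift of the distribution, which is the point of the comparison.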

I don't want to draw any conclusion, because this is just a very rough model. However, I think you can see the point. I am also not saying that a PhD student should focus only on learning without any research responsibility. I personally think a PhD student should definitely spend more time on research than on learning. And doing research is actually another very important way of learning; that is why I put the Taiji diagram at the beginning of this article, because the two boost each other. The point I want to make is that during the journey, there will sometimes be a stagnant period in research, and we might feel sad. However, we should smile, because as long as we keep learning and keep pushing the mean of our normal distribution to the right, things will be fine :)

References:

[1] The picture. http://www.acuherb.us/image/taiji01.png

[2] http://blog.liyiwei.org/?p=1429