July 28, 2011 -- Want to detect if online hotel reviews are legit? Brush up on your parts of speech.
Software developed by Cornell University researchers found that if an online hotel review uses a lot of nouns and is heavy on specifics like the size of the bathroom, it's probably genuine. If it is verb-packed and raves in general about how great everything was, be wary.
The researchers had 400 separate reviewers each write one fake review of a leading Chicago hotel. They took another 400 actual reviews of 20 leading Chicago hotels from TripAdvisor, the top travel site.
Then three human judges -- Cornell undergraduates -- were each given 160 reviews and asked to decide whether each one was real or fake. Two were right about half the time, while the most astute of the three was 62 percent accurate.
The software was close to 90 percent accurate at nailing the fakers.
"We didn't expect to be able to do that much better than humans," said Myle Ott, 22, a Cornell Ph.D. student in computer science and co-author of the study along with Claire Cardie, a professor of computer science; Jeff Hancock, a communication professor; and Yejin Choi. They presented their findings last month at a meeting of the Association for Computational Linguistics.
The software the researchers devised was able to pick out the phony reviews partly by analyzing parts of speech.
"We found that the deceptive reviews generally use more verbs, adverbs and pronouns," said Ott, while "truthful ones use more nouns, prepositions and adjectives."
Basically, the real reviewers talked about features they actually saw in the hotel, like the room size or the checkout speed. The fakers talked about external things, like going to Chicago with their spouses and having a fabulous time.
For example, an actual review of the Affinia in Chicago cited in the study said: "The recently remodeled Affina [sic] was amazing -- from the 6 choice pillow menu to the stocked 'pantry and refrigerator' to the brand name bathroom amenities."
A fake review of the same hotel reads: "I took my family to the Affinia Chicago for a short vacation last summer and it was fabulous. My kids say that their favorite part was the pool and the fact that the amusement park is really close by. My wife was really stoked with Spaffina."
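As a rough illustration of the part-of-speech signal Ott describes, the Python sketch below measures how much of a text falls into each of the two groups of word classes he names. It is not the Cornell software: the coarse tag names, the `pos_lean` function and the hand-assigned tags on the two short fragments (taken from the sample reviews above) are all assumptions made for this example. A real system would get its tags from an automatic tagger and feed them to a trained classifier.

```python
from collections import Counter

# Word classes from Ott's quote. The coarse tag names are an assumption.
DECEPTIVE_LEANING = {"VERB", "ADV", "PRON"}   # verbs, adverbs, pronouns
TRUTHFUL_LEANING = {"NOUN", "ADP", "ADJ"}     # nouns, prepositions, adjectives


def pos_lean(tagged):
    """Return (deceptive_share, truthful_share) for a list of (word, tag) pairs."""
    counts = Counter(tag for _, tag in tagged)
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0.0
    deceptive = sum(counts[tag] for tag in DECEPTIVE_LEANING) / total
    truthful = sum(counts[tag] for tag in TRUTHFUL_LEANING) / total
    return deceptive, truthful


# Fragments of the two sample reviews, tagged by hand for illustration only.
fake_fragment = [("My", "PRON"), ("wife", "NOUN"), ("was", "VERB"),
                 ("really", "ADV"), ("stoked", "ADJ")]
real_fragment = [("the", "DET"), ("brand", "NOUN"), ("name", "NOUN"),
                 ("bathroom", "NOUN"), ("amenities", "NOUN")]

print(pos_lean(fake_fragment))  # (0.6, 0.4)
print(pos_lean(real_fragment))  # (0.0, 0.8)
```

On these two fragments the fake text leans toward verbs, adverbs and pronouns while the real one is almost all nouns, but five words are far too few to judge a review. The sketch only shows the kind of count involved.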
Punctuation was also a giveaway. Reviews that had quite a bit of punctuation, especially dollar signs, dashes, parentheses and ellipses, tended to be real.
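The punctuation signal can be sketched the same way. The Python below counts the four marks the article mentions. The `punctuation_counts` function, the regular expressions and the sample sentence are hypothetical, written for this example rather than taken from the study, and a count by itself says nothing about where the line between real and fake falls.

```python
import re

# One pattern per mark named in the article. The patterns are an assumption:
# a run of two or more hyphens, or a Unicode dash, counts as one dash, and
# three or more periods, or the single ellipsis character, as one ellipsis.
PATTERNS = {
    "dollar_signs": r"\$",
    "dashes": r"-{2,}|\u2014|\u2013",
    "parentheses": r"[()]",
    "ellipses": r"\.{3,}|\u2026",
}


def punctuation_counts(review):
    """Count occurrences of each mark in a review's text."""
    return {name: len(re.findall(pattern, review))
            for name, pattern in PATTERNS.items()}


# Invented sample sentence, not from the study.
sample = "Paid $120 (plus tax) -- worth it..."
print(punctuation_counts(sample))
# {'dollar_signs': 1, 'dashes': 1, 'parentheses': 2, 'ellipses': 1}
```

Single hyphens inside words such as "check-in" are deliberately ignored so that ordinary compounds are not counted as dashes.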
TripAdvisor has its own fake-detection techniques, according to spokeswoman Karen Drake.
She said in an email that reviews are "systematically screened by our proprietary site tools" and that "our large and passionate community of more than 45 million monthly visitors" also helps report suspicious content. The site also employs quality assurance specialists.
Ott hopes the software can have wider uses.
"We're looking into how we might extend these techniques to restaurant and product reviews," Ott said.