Duo Security, a unified access security and multi-factor authentication provider, yesterday published the technical research and methodology behind a three-month research project. The report describes ways to identify automated Twitter accounts (bots) at scale.
The paper accompanies the company’s Black Hat USA talk, “Don’t @ Me: Hunting Twitter Bots at Scale”. It covers the following:
- How to gather the dataset
- The scientific approach to data analysis
- How to build a classifier to identify bots
- How to identify botnets, including a spam-spreading botnet case study
The Duo Security Analysis Report
From May to July 2018, the researchers gathered and analyzed 88 million public Twitter accounts. The accounts comprise half a billion tweets, making this one of the largest random datasets of Twitter accounts studied to date.
Duo built the dataset from information gathered through Twitter’s publicly available API. This includes each account’s profile name, avatar, follower and following counts, tweet count, and personal bio. The researchers also collected the content of tweets and the accounts’ social network connections.
Jordan Wright, Principal R&D Engineer at Duo, and Olabode Anise, Data Scientist, will present their research, “Don’t @ Me: Hunting Twitter Bots at Scale”, on 8th August at around 2:40 p.m. PDT at the Black Hat USA 2018 security conference in Las Vegas.
After the presentation, Wright and Anise will release their research tools on GitHub to help other researchers identify automated Twitter accounts at scale.
Olabode Anise, data scientist at Duo, says:
“Users are likely to trust a tweet more or less depending on how many times it’s been retweeted or liked. Those behind this particular botnet know this, and have designed it to exploit this very tendency. The bots’ attempts to thwart detection demonstrate the importance of analyzing an account holistically, including the metadata around the content. For example, bot accounts will typically tweet in short bursts, causing the average time between tweets to be very low. Documenting these patterns of behavior can also identify other malicious and spam botnets.”
While applying machine learning algorithms to identify bot accounts in their dataset, the Duo Labs researchers also uncovered a sophisticated cryptocurrency scam botnet comprising at least 15,000 bots.
The report explains how the botnet siphons money from unsuspecting users by spoofing celebrities, cryptocurrency exchanges, verified accounts, news organizations, and others. The accounts in the botnet were designed to appear genuine and so evade automated detection.
The analysis also describes a machine learning model that uses 20 unique account characteristics to distinguish a human Twitter account (a “genuine” account, in the study’s terms) from an automated bot. These characteristics include:
- Time between tweets
- Different tweet sources
- Average number of hours daily an account is active
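To make the three characteristics above concrete, here is a minimal Python sketch of how they might be computed for a single account. It is not Duo’s implementation: the `Tweet` record, its field names, and the `account_features` function are hypothetical, and “hours active” is assumed here to mean the number of distinct clock hours per day in which the account tweeted at least once.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Tweet:
    # Hypothetical minimal record; real API payloads carry many more fields.
    created_at: datetime
    source: str  # client used to post the tweet


def account_features(tweets):
    """Compute three illustrative per-account features from Tweet records."""
    if not tweets:
        raise ValueError("need at least one tweet")
    ordered = sorted(tweets, key=lambda t: t.created_at)

    # Average time between consecutive tweets, in seconds (0.0 for one tweet).
    gaps = [
        (later.created_at - earlier.created_at).total_seconds()
        for earlier, later in zip(ordered, ordered[1:])
    ]
    avg_gap_seconds = mean(gaps) if gaps else 0.0

    # Number of different clients the account tweeted from.
    distinct_sources = len({t.source for t in ordered})

    # Assumed definition: distinct clock hours with a tweet, averaged over
    # the days on which the account tweeted at all.
    hours_by_day = {}
    for t in ordered:
        hours_by_day.setdefault(t.created_at.date(), set()).add(t.created_at.hour)
    avg_active_hours = mean(len(hours) for hours in hours_by_day.values())

    return {
        "avg_gap_seconds": avg_gap_seconds,
        "distinct_sources": distinct_sources,
        "avg_active_hours": avg_active_hours,
    }


# Made-up sample data: a short burst on one day, one tweet the next day.
sample = [
    Tweet(datetime(2018, 6, 1, 9, 0, 0), "web"),
    Tweet(datetime(2018, 6, 1, 9, 0, 30), "web"),
    Tweet(datetime(2018, 6, 1, 9, 1, 0), "app"),
    Tweet(datetime(2018, 6, 2, 14, 0, 0), "web"),
]
features = account_features(sample)
```

In a classifier, a dictionary like `features` would be one row of the training data. A single mean can hide bursts (one long gap dominates the sample above), which fits the article’s point about looking at an account through several characteristics at once.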
Highlights of the Research
- New open-source tools and techniques that can be used to uncover large-scale botnets.
- Analysis of one of the largest random Twitter datasets to date, identifying 20 unique account characteristics that help distinguish bots from genuine accounts.
- Discovery of a cryptocurrency scam botnet of at least 15,000 bots.
- Mapping of the scam botnet’s three-tier, hierarchical structure, which includes scam-publishing bots as well as bots that artificially inflate the popularity of tweets to legitimize the scam link.
In response to this research, a Twitter spokesperson said:
“Twitter is aware of this form of manipulation and is proactively implementing a number of detections to prevent these types of accounts from engaging with others in a deceptive manner. Spam and certain forms of automation are against Twitter’s rules. In many cases, spammy content is hidden on Twitter on the basis of automated detections. When spammy content is hidden on Twitter from areas, like search and conversations, that may not affect its availability via the API. This means certain types of spam may be visible via Twitter’s API even if it is not visible on Twitter itself. Less than 5% of Twitter accounts are spam.”
Further details about the research are available on the Duo blog and will be presented at Black Hat USA.