Data, Technologies and Security - Part 2

Around 5 months ago we published a blogpost titled "Data, Technologies and Security - Part 1". This week we sat down to write part 2 and initially considered dedicating it to some of the other technologies we have analyzed.

However, we received a request asking if we were still doing work around the databases from the first blogpost, so instead of covering new technologies in part 2 we decided to re-scan what we had previously scanned and compare the results with those from Part 1.

So, which technologies did we look at? The same ones as in the first post: Redis, MongoDB, Memcache and ElasticSearch.

One interesting fact is that scanning the internet is not deterministic: you don't control the entire chain, and weird things can happen. Sometimes machines reply to your scans, sometimes they don't; sometimes machines are online, other times they are not. Then we have dynamic IPs, which make things even more interesting.

Over the last 5 months BinaryEdge has focused on improving the quality of our 40FY platform and the data we collect and pass to our clients. Our BETA clients were extremely valuable and gave us a lot of feedback, which allowed us to bring the platform to its final, launch-ready state in January.

That quality increase means we got better data this time around; we have also improved our scanning algorithms, and that shows in this blogpost.

So without further ado...

Redis####

Redis was a very interesting case. We noticed a huge change in our Redis scans; initially we even suspected a problem with our sensors, but as we dug further we found something interesting.

On the initial scan we had found 35,330 accessible instances; this time around we found only 17,482. By accessible we mean instances that had the port open and replied to our status query.

In terms of memory, the current memory in use went from 13.2133 TB to a mere 1.82 TB; peak memory went from 17.0801 TB to 3.196 TB.

We found this difference extremely weird, and by looking at our data (we can easily create diffs between different scans) we noticed that many of the previously found instances were now replying with:

{"result_type":"redis","job_id":"XXX-cb123fea-5922-4590-b405-f7f2229138ac",
"client_id":"XXX","provider":"min-05-11815-usnj-dev","origin":"grabber",
"src":{"ip":"XXX.XXX.XXX.XXX","port":6379},
"result":{"error":"Ready check failed: NOAUTH Authentication required."},
"ts":1452775516359,"@timestamp":"2016-01-14T12:45:16.437Z"}

Googling a bit and looking at the Redis changelog for the last few months, we didn't see any change that could explain this; however, we did notice an increase in people complaining about this error suddenly appearing on their Redis instances.

What apparently happened is that someone scanned the internet for Redis instances and activated authentication on most of them.
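If you want to check whether one of your own instances is in this state, the test is simple enough to sketch. The snippet below is a minimal illustration of ours (not part of any Redis tooling): it speaks the plain-text Redis protocol over a raw socket and sends a PING; an open instance replies +PONG, while one locked down with requirepass returns the same NOAUTH error shown above. The address used is a placeholder.

```python
import socket

def redis_requires_auth(host, port=6379, timeout=5.0):
    """Send an inline PING and report whether the server demands auth.

    An open Redis replies "+PONG"; an instance locked down with
    `requirepass` replies "-NOAUTH Authentication required." (Redis 2.8+;
    older versions reply "-ERR operation not permitted").
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"PING\r\n")  # inline command, valid in the Redis protocol
        reply = sock.recv(1024).decode("ascii", errors="replace")
    return reply.startswith("-NOAUTH") or reply.startswith("-ERR operation not permitted")

if __name__ == "__main__":
    # 203.0.113.10 is a documentation placeholder, not a host from our dataset.
    print(redis_requires_auth("203.0.113.10"))
```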

In terms of version changes, a lot happened as well:

[Chart: Top 10 versions]

MongoDB####

On the first blogpost we had found 39,134 MongoDB server instances that answered our requests and didn't have any type of authentication, with a total of 619,803,378,799,485 bytes exposed (or 619.803 Terabytes for short). We had also found 7,267 instances asking for authentication.

So what changed in the new scan?

For starters, we found 39,805 instances exposed; this number went up by 671. In terms of data, on this scan we found 781,651,772,749,302 bytes exposed (or 781.651 Terabytes for short).
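For context, that total is the kind of number an open MongoDB will volunteer on request. The sketch below is a minimal illustration, assuming the pymongo driver is installed (the address is a placeholder): the standard listDatabases command on an unauthenticated instance returns every database name plus a totalSize in bytes, with no credentials required.

```python
from pymongo import MongoClient
from pymongo.errors import OperationFailure, PyMongoError

def mongo_exposed_bytes(host, port=27017):
    """Return the total bytes reported by an unauthenticated listDatabases,
    or None if the server requires authentication or is unreachable."""
    client = MongoClient(host, port, serverSelectionTimeoutMS=5000)
    try:
        info = client.admin.command("listDatabases")
        return int(info["totalSize"])  # on-disk bytes across all databases
    except OperationFailure:
        return None  # the server asked for authentication
    except PyMongoError:
        return None  # timeout, connection refused, etc.

if __name__ == "__main__":
    # 203.0.113.20 is a documentation placeholder, not a host from our dataset.
    print(mongo_exposed_bytes("203.0.113.20"))
```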

The geographical distribution of the new scan looks as follows:

[Chart: Worldwide Spread]

Another interesting comparison between our old and new scans is the version distribution:

[Chart: Jan 2016 vs Sep 2015]

Also interesting is the comparison of database names between the old scan and the new one:

[Chart: Jan 2016 vs Sep 2015]

It is great to see that some of the "DELETED_BECAUSE_YOU_DIDNT_PASSWORD_PROTECT_YOUR_MONGODB" databases have disappeared: this number went from 347 to 143.

Memcache####

For Memcache, on the initial scan we found 118,574 instances and this time around we found 171,148. In the previous scan we had found 11,347,052,520,140 bytes exposed (11.347 Terabytes); this time the number increased to 13,224,426,616,173 bytes (13.224 Terabytes).
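Byte counts like these come straight from memcached's own stats command, which any open instance answers over a plain TCP connection. Below is a minimal sketch of that check, using only a raw socket (the address is a placeholder): the "STAT bytes" line is the amount of stored item data the server reports about itself.

```python
import socket

def memcache_stored_bytes(host, port=11211, timeout=5.0):
    """Ask an open memcached for its stats and return the reported number
    of stored bytes (the "STAT bytes <n>" line), or None if not found."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):  # stats output ends with END
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    for line in data.decode("ascii", errors="replace").splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT" and parts[1] == "bytes":
            return int(parts[2])
    return None

if __name__ == "__main__":
    # 203.0.113.30 is a documentation placeholder, not a host from our dataset.
    print(memcache_stored_bytes("203.0.113.30"))
```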

The worldwide distribution of the new scan looks like this:

[Chart: Worldwide Spread]

We also noticed some changes in the versions of Memcache that we found:

[Chart: Jan 2016 vs Sep 2015]

ElasticSearch####

For our final technology we present our findings on ElasticSearch. Our previous scan had detected 531,199 Terabytes of data across 8,990 instances. This time around we found 9,597 instances exposing a total of 137,080 Terabytes. A major decrease was seen here; the reason is that a couple of large servers that exposed a lot of data have since been fixed.
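ElasticSearch makes this kind of measurement trivial, since an open instance exposes its REST API without authentication by default. Below is a minimal sketch using only the Python standard library (the address is a placeholder): the index stats endpoint reports the on-disk store size, in bytes, for everything held by the node.

```python
import json
from urllib.request import urlopen

def es_exposed_bytes(host, port=9200, timeout=5.0):
    """Query an open ElasticSearch node's stats API and return the total
    on-disk store size in bytes across all indices, or None on failure."""
    url = f"http://{host}:{port}/_stats/store"
    try:
        with urlopen(url, timeout=timeout) as resp:
            stats = json.load(resp)
        return stats["_all"]["total"]["store"]["size_in_bytes"]
    except Exception:
        return None  # unreachable, an auth proxy in front, unexpected payload

if __name__ == "__main__":
    # 203.0.113.40 is a documentation placeholder, not a host from our dataset.
    print(es_exposed_bytes("203.0.113.40"))
```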

Conclusion####

So 5 months later, how have things changed? Are they any better?

It's a hard thing to measure. The situation with Redis surely paints a worrying picture and affects our data. The same way someone went and executed that lockdown across the exposed Redis servers, other actions might have been executed as well. We recommend that everyone who has a Redis instance exposed to the internet have a look at their logs.

It is good to see that some of the ElasticSearch servers have been fixed; however, it's saddening to see that the number of exposed instances is increasing, not only for ES but also for Memcache and MongoDB.

If you would like to do your own scans, monitor your organisation, and obtain scanning data, have a look at our platform 40fy.io.

If you want to keep up to date with our analyses and posts, please consider following us on Twitter, Google+ and Facebook.