Engineering
April 07, 2023

Which keys do programmers press the most? We found out with GreptimeDB

GreptimeDB v0.1 has been released, providing an initial but fairly dependable standalone version that anyone can use. As programmers, we rarely meet industrial scenarios that generate huge amounts of data in our daily work. So how can we appreciate the advantages of a time-series database in solving practical problems? Our engineering colleagues wrote a straightforward script and a handful of queries to analyze keyboard usage with GreptimeDB, and turned up some intriguing findings.

GreptimeDB recently released version 0.1. Although it is still at an early stage, it is already a dependable and useful database that can be applied in practical scenarios.

Time-series data comes mainly from the Internet of Things and monitoring fields, where it is typically large in volume and high in frequency. If you're not familiar with time-series data and time-series databases, we suggest that you read this article first.

Where can programmers, who spend their days sitting at a desk and typing on a keyboard, find a scenario that produces high-frequency data in large volumes?

Wait a minute, typing on the keyboard seems to be generating time-series data. I'm intrigued to know how frequently I press keys in a day, and which specific keys are used more often than others. There's a common joke that programmers rely heavily on the ctrl+C and ctrl+V shortcuts, but how often are they actually used?

With GreptimeDB v0.1, all we need to do is create a script and utilize GreptimeDB to find the answer.

Download and connect to database

The first step is to install GreptimeDB locally. The download page has recently gone online, so you can download the compiled binary from the website and start a standalone instance with greptime standalone start. If everything works, you should see:

image1

We use the HTTP interface on port 4000 to make requests to the database, and you can just send SQL statements. See the syntax here.

Write scripts to record information

For example, running curl http://localhost:4000/v1/sql -d "sql=CREATE TABLE keymaster(key STRING, ts TIMESTAMP TIME INDEX)" on the command line creates a new table for recording keystrokes.

Next comes recording the keystrokes themselves. Here I'll use Node.js, which I'm familiar with.
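If you would rather send SQL from a script than from curl, here is a minimal sketch. It assumes Node.js 18 or later (for the built-in fetch) and the default local address used in this article. The helper names buildSqlRequest and runSql are my own and not part of GreptimeDB.

```javascript
// Sketch: send one SQL statement to GreptimeDB's HTTP endpoint.
// Assumes Node.js 18+ (global fetch) and the local address used in this article.
const ENDPOINT = 'http://localhost:4000/v1/sql'

// Build the fetch arguments separately so they can be inspected without a server.
function buildSqlRequest(sql) {
  return {
    url: ENDPOINT,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      // URLSearchParams form-encodes the statement (curl -d sends the same content type)
      body: new URLSearchParams({ sql }).toString(),
    },
  }
}

// Hypothetical helper: post the statement and return the parsed JSON response.
async function runSql(sql) {
  const { url, options } = buildSqlRequest(sql)
  const res = await fetch(url, options)
  if (!res.ok) throw new Error(`HTTP ${res.status}: ${await res.text()}`)
  return res.json()
}

// Usage (needs a running database):
// runSql('CREATE TABLE keymaster(key STRING, ts TIMESTAMP TIME INDEX)').then(console.log)
```

Keeping the request builder separate from the network call makes it easy to check what will be sent before pointing it at a real database.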

Javascript
const axios = require('axios')
const qs = require('qs')
const { GlobalKeyboardListener } = require('node-global-key-listener')
const v = new GlobalKeyboardListener()

// Log every key that's pressed.
v.addListener(function (e, down) {
  if (e.state === 'DOWN') {
    // Collect the named keys currently held down (modifiers and the like), ignoring lock keys
    let metaKey = Object.entries(down)
      .filter(([key, value]) => value === true && key.length > 1 && !/LOCK/.test(key))
      .map(([key]) => key)

    // Append the pressed key itself when its name is a single character
    if (e.name.length === 1) metaKey.push(e.name)
    metaKey = metaKey.join('+')
    saveKeypressEvent(metaKey)
    console.log(metaKey)
  }
})

// Insert one row per keypress through the HTTP SQL interface
const saveKeypressEvent = async (metaKey) => {
  try {
    await axios.post(
      'http://127.0.0.1:4000/v1/sql?db=public',
      qs.stringify({
        sql: `INSERT INTO keymaster(key, ts) VALUES('${metaKey}', ${new Date().valueOf()})`,
      })
    )
  } catch (error) {
    console.log(error)
  }
}

The script joins the keys held down at the same moment with +, so each key combination is stored as a single value and its frequency can be counted too.
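To make the combination logic easier to follow, here it is pulled out of the listener as a pure function, along with a small guard against single quotes in the stored value. This is only a sketch: the function names are hypothetical, the sample key names in the usage comment are illustrative, and doubling the quote is standard SQL escaping that I'm assuming GreptimeDB accepts.

```javascript
// Sketch of the combination-naming step, separated from the listener.
// `eventName` mirrors e.name and `down` mirrors the map of held keys
// that the listener callback receives.
function comboName(eventName, down) {
  const held = Object.entries(down)
    // Keep named keys that are held down, skipping single characters and lock keys
    .filter(([key, isDown]) => isDown === true && key.length > 1 && !/LOCK/.test(key))
    .map(([key]) => key)
  // Append the pressed key itself when its name is a single character
  if (eventName.length === 1) held.push(eventName)
  return held.join('+')
}

// Double any single quote so the value cannot break out of the SQL string literal.
function sqlString(value) {
  return `'${String(value).replace(/'/g, "''")}'`
}

// Build the INSERT statement for one keypress (ts in milliseconds).
function insertStatement(key, ts = Date.now()) {
  return `INSERT INTO keymaster(key, ts) VALUES(${sqlString(key)}, ${ts})`
}

// Usage:
// comboName('C', { 'LEFT META': true, C: true })  // 'LEFT META+C'
```

Splitting the naming from the saving also means the listener body shrinks to two calls, which makes it harder to reintroduce quoting mistakes in the INSERT string.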

Query analysis

I then used xbar to run a script on a timer and display the query results, but you can run the queries any way you like.

For example, this query returns the 10 most pressed single-character keys:

sql
SELECT key, COUNT(*) AS times
FROM keymaster
WHERE LENGTH(key) < 2
GROUP BY key
ORDER BY times DESC
LIMIT 10

This query returns the 10 most used key combinations:

sql
SELECT key, COUNT(*) AS times
FROM keymaster
WHERE key LIKE '%+%'
GROUP BY key
ORDER BY times DESC
LIMIT 10
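If you want to show the results in xbar as I did, the timed script mostly needs to build the query and print plain text. The sketch below covers only those two steps. It assumes the rows have already been extracted from the HTTP response as [key, times] pairs, since the exact response layout depends on your GreptimeDB version, and both function names are hypothetical.

```javascript
// Sketch: build a top-N query and turn its rows into xbar menu text.
// Assumes rows arrive as [key, times] pairs; extracting them from the
// HTTP response is left out because the response layout may vary by version.
function topKeysQuery(where, limit = 10) {
  return `SELECT key, COUNT(*) AS times FROM keymaster WHERE ${where} GROUP BY key ORDER BY times DESC LIMIT ${limit}`
}

function formatForXbar(title, rows) {
  const lines = rows.map(([key, times]) => `${key}: ${times}`)
  // xbar shows the lines above '---' in the menu bar and the rest in the dropdown
  return [title, '---', ...lines].join('\n')
}

// Usage:
// console.log(formatForXbar('Top keys', [['i', 1200], ['e', 1000]]))
```

The script only has to write this text to standard output; xbar takes care of running it on a schedule.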

Data results

Here is the cumulative data after more than a week of recording. Several interesting conclusions can be drawn from it:

image2

  • The average number of keystrokes per day is about 10,000
  • Backspace accounts for about one tenth of all keystrokes
  • The most used letter is i, pressed almost 20% more often than the runner-up, e. This is probably because I use vim every day
  • I use cmd+w a lot to close windows, because I am not a tab hoarder
  • Although auto save is on, cmd+s seems to be baked into my muscle memory
  • cmd+V is pressed almost 20% more often than cmd+C, since every copy is followed by at least one paste
  • Copy and paste really are key combinations that programmers use frequently

You may notice that the screenshot above also shows an APM (Actions Per Minute) figure. I calculated it in real time from the keystroke timestamps and stored it in a separate table.

Visualising the results

Open http://localhost:4000/dashboard in your browser, then enter SELECT * FROM apm ORDER BY ts DESC LIMIT 1800 in the console to see the data as a chart.

The Dashboard is another open-source project created by the Greptime team to make GreptimeDB easier to use. It is bundled with the GreptimeDB distribution and is available locally once the database is running.

image3

Here is my average number of actions per minute (APM) over the last 30 minutes, which is not a bad typing speed.

You won't see this data locally until you have created the apm table yourself. After reading the steps above, writing a small statistics script for it should be easy.

If you run into any problems with GreptimeDB or the Dashboard, or have suggestions, please feel free to open an issue or a pull request on our project.
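As a starting point, here is one way such a statistics script could compute APM. It is a sketch, not the script I actually ran. It counts the keystrokes in the 60 seconds before a given moment. The LIMIT 1800 in the query above, covering 30 minutes, suggests one sample per second, but that sampling rate is an inference. Only the table name apm and the ts column appear in this article, so the value column and the CREATE TABLE statement are assumptions.

```javascript
// Sketch: rolling actions-per-minute from keystroke timestamps (milliseconds).
const WINDOW_MS = 60_000

// Hypothetical schema: only the table name `apm` and the `ts` column
// come from the article; the `value` column is an assumption.
const CREATE_APM_TABLE = 'CREATE TABLE apm(value BIGINT, ts TIMESTAMP TIME INDEX)'

// Count the keystrokes in the 60 seconds ending at `now` (inclusive of `now`).
function apmAt(timestamps, now) {
  return timestamps.filter((ts) => ts > now - WINDOW_MS && ts <= now).length
}

// Build the INSERT for one sample; call this on a timer, for example once per second.
function apmInsert(timestamps, now = Date.now()) {
  return `INSERT INTO apm(value, ts) VALUES(${apmAt(timestamps, now)}, ${now})`
}

// Usage:
// const recent = []                 // push Date.now() on every keypress
// setInterval(() => console.log(apmInsert(recent)), 1000)
```

In a long-running script you would also want to drop timestamps older than the window so the array does not grow without bound.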

This article shows a very simple example that uses only a small part of GreptimeDB's many capabilities, and the amount of data is small enough that GreptimeDB handles it with ease. Even if ten people rolled their faces across their keyboards at once, GreptimeDB would keep up.

So if you have to deal with large amounts of data, you should definitely try GreptimeDB.
