The Processing IDE continues to impress me with its visual capabilities and intuitive functions. In fact, I was surprised to find that I could program a side-scroller – a genre I had hitherto considered complex to implement – within 90 lines of code, and decided to write about it.

The term ‘side-scroller’ now encompasses all video games that present their gameplay from a side-view camera angle. The common element in all of these games is a display that scrolls in response to the player’s input.

This month’s discussion is dedicated to BikeRoller – an endlessly scrolling sketch. Here, the player steers the biker (and, with it, the scrolling background) using the LEFT and RIGHT arrow keys.

A brief history  

Side-scrollers have been in existence since the early 1970s, with Speed Race (1974) being the first game to use a scrolling display, albeit a vertical one. This technological feat was achieved by Tomohiro Nishikado, the game’s designer, who also incorporated sprites and collision detection in this racing arcade game.

The most iconic title in this genre is Super Mario Bros., a platformer developed and published by Nintendo for the NES (Nintendo Entertainment System). Here, the player controls the protagonist Mario, who must race through all the stages in the Mushroom Kingdom to save Princess Peach (US version: Princess Toadstool), while smashing bricks, collecting coins and power-ups, and defeating enemies such as Goombas (mushrooms) and Koopa Troopas (turtles).

This game laid the foundation for subsequent (as well as modern) video games, whether side-scrolling or not, by incorporating secret levels to discover, power-ups to collect, and enemies to defeat – all within a fixed time frame and a limited number of lives.

While side-scrollers have given way to FPSes (first-person shooters) over the past two decades, they continue to be popular on smartphones and other handheld devices.

Scrolling the display

The display may be scrolled in several different ways – horizontally, vertically, or a combination of both. In this sketch, the background elements (the road, the river, and the sky) move in response to the arrow keys pressed by the user. Hence, the biker (controlled by the player) appears to move, when it is actually the foreground and background elements moving behind it.

To give the player a sense of depth, a parallax effect is created by moving the foreground and background elements at different speeds.

Though the sketch is quite basic, it illustrates the essential workings of a side-scroller. Currently using only the LEFT and RIGHT arrow keys as its input, its scope may be expanded with more keyboard and mouse controls.

Technical explanation

As can be seen from the algorithm, three inputs are required to execute the scrolling loop: 

1. img – an image file of the desired element, 

2. step – an increment value by which the element is moved or scrolled, and

3. keyCode – the value of the pressed key. 

Apart from these, two more integers are declared within this function, which are as follows:

1. x – the x-position of the element on the screen. It is used in the incrementing process (updating the rendered element’s position)

2. dir – used to store either the value of -1 or 1, depending on the value of keyCode.

In the first run, the scrolling function will render the element by positioning one copy of it at (0, 0), (-imgWd, 0), and (imgWd, 0). [Note: imgWd is the pixel width of the element.]

After this, x is incremented by -dir*step, and the value of dir is then checked.

From here on, depending upon the conditions mentioned in the algorithm, the flow of control will move either towards termination of the loop, or continuation of the same set of steps detailed above.
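To make the algorithm concrete, here is a rough rendition of just the position update. The actual sketch is written in Processing, so this Python version – and the names scroll_step and tile_positions – are my own illustrative inventions:

```python
# A minimal sketch of the scrolling update, assuming the key codes map to -1/1.
LEFT, RIGHT = -1, 1

def scroll_step(x, key_dir, step, img_wd):
    """Move the element by -dir*step and wrap once a full width has scrolled."""
    d = RIGHT if key_dir == RIGHT else LEFT
    x += -d * step
    # Snap back by one image width so the three tiled copies keep covering
    # the screen, no matter how long the player holds the key down.
    if x <= -img_wd:
        x += img_wd
    elif x >= img_wd:
        x -= img_wd
    return x

def tile_positions(x, img_wd):
    """The three x-positions at which copies of the element are drawn."""
    return [x - img_wd, x, x + img_wd]
```

Starting from x = 0, repeated RIGHT presses walk x leftwards in increments of step, and the wrap-around keeps the three copies tiling the display seamlessly.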

Hopefully, you have now learnt a little more about the technicalities of a scrolling background.

Related Links        

1. BikeRoller; the project on GitHub:          

2. Side Scrollers: A Planar Odyssey; a historical narrative:

3. Forerunners : The History Of The PC Side-Scroller; a documentary:

4. Endless Runner Games: Evolution and Future; an article:


Processing Visuals

Of late, I have been tinkering around with a new, Java-based IDE (integrated development environment) – Processing 3. Also offered in Python and JavaScript flavours (the latter as p5.js), Processing is an open-source initiative by Benjamin Fry and Casey Reas to simplify the programming of visuals.

Having seen several output demos and YouTube tutorials about this platform, I decided to try it out myself. After downloading it, and going through several Processing sketches – both on GitHub and elsewhere – I came up with a sketch of my own. Here is its GIF:

Hypnotic-Spiral (my first Processing sketch)

The Interface

Several factors have influenced my latest IDE choice, one of them being the sheer abundance of tutorials and sketch ideas on the Internet. This, along with its simple sketchbook interface, enhances its usability.

What really encouraged me to install Processing in the first place was its integration with Arduino’s IDE. For quite some time, I had been looking for ways to create some visually useful programs that would interface with my Arduino board. To be more specific, I wanted better control of the output received by the Morse Code machine I built using this tutorial:

I was pleasantly surprised to find that the Arduino sketchbook was inspired by Processing’s interface itself. Here is a side-by-side comparison of both –

Processing and Arduino – note the similarities.

The success of Processing – which has inspired several more projects apart from Arduino – is rightfully credited to the painstaking efforts put in by its community of developers. Hats off to them!

Graphics Programming – Not a New Venture

For me, at least, it’s not a new thing. In fact, the sole reason I picked Computer Science as my optional subject in high school was to understand computer graphics better. At that time, I had read ‘Masters of Doom’ by David Kushner, and was particularly interested in the programming of Doom – a game that paved the way for FPSes (first-person shooters) in the DOS era, partly due to the revolutionary effect of its (pseudo) 3D graphics.

The Computer Science classes were quite useful, since they taught me the basics of programming in Turbo C++ – flow of control, classes, constructors and destructors, pointers, arrays, read–write sequences, etc. To my dismay, all the graphics functions for this IDE were stowed away in the elusive <graphics.h> file, which was never invoked even once in our lessons.

From there, I embarked on a solo mission to teach myself graphics programming. Equipped with a book on Borland Graphics that I found in my school library, I installed the required software – Turbo C++ 4.5 (the IDE) and DOSBox (the DOS emulator) – and taught myself C code, a process that took me at least two months.

Having experienced this, I find Processing a much simpler IDE, both in installation and usage. In fact, I won’t be surprised if Processing becomes the de facto standard for programming in schools and pre-university courses in the coming years.

I look forward to creating more Processing and Arduino sketches in the future.

External Links

  1. Processing (programming language); a Wikipedia article:

  2. Download; the download page:

  3. Hypnotic-Spiral; the GitHub repo:

PDF Conversions – Today’s Necessity

Being a college student, I often find myself at the print shop, carrying with me all kinds of documents to be printed – fee slips, academic transcripts, scanned copies of handwritten notes, etc. While apps like CamScanner help in creating PDF copies of class notes, their functionality is limited to images captured directly within the app. Furthermore, the only file formats recognized at the print shop are JPEG, DOC(X) and PDF.

That’s why I have been scouring Google’s Play Store in pursuit of an app that can convert all my files to PDF copies – and into other formats, as and when required. One such app that fits the bill is PDF Convertor, developed by Cometdocs.

Before reviewing the app, it is imperative to expound a little on the history and advantages of the file format this app is built around – the PDF.

The emergence of PDF

Short for Portable Document Format, the PDF has a legacy spanning more than two decades, with its first version released on 15 June 1993 by Adobe as a proprietary file format. What made its popularity soar to new heights was its standardization as ISO 32000-1, accompanied by Adobe’s Public Patent License, which allowed anyone to make, use, sell and distribute PDF-compliant implementations without paying any royalties to Adobe.

What makes PDF so special today?

There are practical reasons for PDF being the de facto standard for electronic documents. Its ability to render as print-ready graphics on paper, while preserving the hyperlinks, images and text embedded within it, makes it a versatile format. The cherry on top is its file size, which is often much smaller than a JPEG equivalent, thanks to the data compression algorithms it uses.

Another factor is its OS independence: a PDF looks the same across all operating systems, making it more portable. Further, with recent versions of Android supporting PDF, its user base has expanded even more.

Having explained the PDF a little, it’s time to focus on the app itself.

The app’s interface (UI)

On opening PDF Convertor for the first time, the user is greeted with a blank screen, to which files can be added for conversion. There are a total of 24 conversion types to choose from, with 7 of them available as paid features. As of 25th November 2017, the full pack is worth 790 INR, while a la carte conversions are 250 INR each. I was especially interested in its capability to convert XPS to PDF, a hitherto locked feature for me. (XPS is the file format of the output plots generated by OrCAD PSpice, software I use for circuit simulations as part of my undergraduate course.)

Having unlocked the full pack, I set forth to use the app for converting the documents at my disposal.

Some of the in-app file conversions available.

There is also a batch conversion option that allows you to generate a multi-page PDF, or vice versa, depending upon the conversion options at your disposal. I didn’t unlock this feature, since my conversions never exceed a page or two.

An experience limited by Wi-Fi

Despite the app’s well-laid-out design – easy-to-find menus, buttons and notifications – and the slew of conversion options it offers, I was unable to enjoy it to the fullest. The main reason is the Wi-Fi connection at my residence, where signal strength is pretty erratic. More often than not, when trying to convert a file, I get the following message:

Check your connection and try again.

Though I couldn’t carry out conversions at all times, it has been a satisfactory experience. All the conversions worked whenever the Wi-Fi signal was strong enough.

My thoughts and suggestions

Having used different methods of PDF conversion for a while now, I have come to realize that every file conversion requires three steps –

  1. Upload files to a server
  2. Wait for the server to convert the files
  3. Download the converted files

The reason most conversion apps draw flak from users is that they falter at step 1 itself. Not everyone has access to dedicated, high-speed Internet – especially users in developing and underdeveloped nations – making this a huge obstacle that developers need to overcome.

A related point is the use of browser-based converters for the same task. For most users, who generally have only a file or two to convert, it seems more fitting to convert in this manner rather than install a dedicated app.

Keeping this in mind, PDF Convertor can incentivize continued usage by allowing users to create an offline queue of files to be uploaded. As an analogy, YouTube Offline allows users to queue videos, which are then downloaded as and when signal strength is sufficient.
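The offline-queue idea can be sketched in a few lines. This is purely a hypothetical Python illustration – the class name OfflineQueue and its methods are my own, not the app’s actual code:

```python
from collections import deque

class OfflineQueue:
    """Hold files for upload until connectivity returns (illustrative only)."""

    def __init__(self, upload):
        self.pending = deque()
        self.upload = upload  # callable taking a path, returning True on success

    def add(self, path):
        # Queue the file instead of failing with "Check your connection".
        self.pending.append(path)

    def flush(self):
        """Try to upload queued files in order; stop at the first failure."""
        sent = []
        while self.pending:
            if not self.upload(self.pending[0]):
                break  # signal dropped; keep this file and the rest queued
            sent.append(self.pending.popleft())
        return sent
```

Files added while the connection is down simply wait in the queue, and flush() can be retried whenever the signal returns – exactly the behaviour YouTube Offline provides for videos.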

Overall, I find this app an impressive one, and look forward to improvements in its UX.

External Links

  1. PDF Convertor on Google Play; the app:

  2. PDF, What is it FOR?; a video:

  3. PDF, Version 1.7 (ISO 32000-1:2008); a technical description:

  4. Document Management – Portable document format – Part 1: PDF 1.7; the 2008 documentation:

  5. Knowing When to Use Which File Format; an article:

Duolingo – an App Review

I recently acquired a brand-new phone – a Samsung Galaxy J7 – as a replacement for my previous Nokia C6-01 smartphone. The reason is pretty simple: I wasn’t able to install any apps on my Nokia phone, since its Symbian OS is not compatible with .apk files (the package format for Android apps).

The first thing I did with my new phone was to install a few apps – Duolingo being one of them. Since I had come across multiple recommendations for this app, I decided to give it a try. Besides, I was looking for ways to improve my language proficiency in Urdu and Japanese.

Having used the app for a little while now, I feel that it deserves a review of its own – hence this article!

The interface – first impressions

One feature I really admire about Duolingo is its UI (user interface) – clean, simple and intuitive. When the app is opened for the first time, the user is greeted with a plethora of languages to choose from – German, Korean, English, Russian, and Japanese, to name a few. Depending upon the user’s language preferences, it offers these languages in different instruction modes.

Since my preferred language is English, I scrolled through the section for English speakers. To my dismay, I couldn’t find Urdu listed under any section, let alone the English section. However, it did list Japanese, which I decided to try out.

The UX (user experience)

Once a course is selected, the user is redirected to a test pertaining to the language. It is completed only after correctly answering a certain number of questions, following which some XP is earned, along with a few ‘lingots’ – the currency used for purchases from the ‘Shop’.

Each ‘skill’, indicated by an egg icon, comprises a number of tests, which must be completed in a similar fashion. Each test has multiple-choice questions, translation tasks (audio and/or text), and word-match questions. The more questions the user answers correctly in a row, the more XP and lingots he or she earns.

While the app may be used without registration, things get a little tricky when the user wishes to save his or her progress. In that case, registration is required.

However, once registered, users are allowed to join a language club. These clubs have weekly leaderboards, which effectively gamify the app by creating an atmosphere of competitiveness.

Improving the app

If you’re looking for an app to learn languages in the form of a casual ‘game’, then Duolingo is the way to go. However, I wasn’t quite satisfied with the app – though I probably had unrealistically high expectations of it.

In order to truly learn a language, one must not only read and listen to it, but also write and speak it. While I don’t mind jotting down words in a notebook, I don’t know whether my handwriting is legible or not. If there were a ‘capture’ feature in Duolingo to detect and identify handwritten text, it would be a big help in improving my Japanese handwriting.

When it comes to speaking the language, it is tough to grasp the pronunciations correctly, even with audio read-outs of the displayed words. For this, I suggest adding IPA transcriptions to every word, and having the app read out those transcriptions. This would go a long way in making the app’s experience more fulfilling.

Edit: After publishing this post, I came across TinyCards, which is another app developed by Duolingo. Its feature of allowing the creation of custom decks by users really impressed me.

In fact, I would go so far as to say that TinyCards is the perfect learning aid I have come across, for teachers and students alike.

Here is a deck of Urdu words I created, using this app:

Related links

  1. Duolingo on Google Play; the app:

  2. IPA transcriptions in Duolingo; a GitHub repo:

  3. Recognizing handwritten glyphs; a research paper:

Text Detection using Tesseract

For the past couple of months, my colleague and I have been working on a research project.

The goal is simple – detect characters from a real-world image. However, the intermediate steps involved don’t make the task as straightforward as you might think!

Before discussing the technicalities of the project, it’s important to know what OCR is.

OCR – the heart of text detection

Short for Optical Character Recognition, OCR is used to identify glyphs – be they handwritten or printed. Each detected glyph is separately assigned a character by the computer.

While OCR has gained traction in recent times, it is not a new concept. In fact, it is this very technology that banks use to read cheques and bank statements.

For this project we chose Tesseract as our OCR engine. Originally developed at HP and now maintained by Google, it is the engine used in the Google Keep app to convert images to text.

The project’s nitty-gritty

We have limited our scope to printed text – specifically, street signs – and are attempting to convert the captured images to .txt files. This is how our code is intended to work:

If it works, it would then be possible to scale down the file size – a very handy capability for storing names of places on smartphones, which always come equipped with a camera these days. Ideally, such a task would be easy to accomplish, given perfect lighting, no perspective distortion or warping, and no background noise.

Reality, unsurprisingly, is quite the opposite. Hence, we are trying to process the images before feeding them to Tesseract, which is known to work best with binary (black and white) images.

According to our plan, we shall implement a three-step method:

  1. remove perspective distortion from the image
  2. binarize the image
  3. pass the image through Tesseract
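As a taste of step 2, here is a minimal, pure-Python sketch of binarization using Otsu’s global threshold. To be clear, this is only a stand-in for illustration – our project evaluates a colour-independent technique instead, and real code would operate on image arrays rather than flat pixel lists:

```python
def otsu_threshold(pixels):
    """Pick the grey level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * hist[i] for i in range(256))
    best_t, best_var = 0, -1.0
    w_b = sum_b = 0
    for t in range(256):
        w_b += hist[t]                    # background pixel count
        if w_b == 0:
            continue
        w_f = total - w_b                 # foreground pixel count
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b                 # background mean grey level
        m_f = (total_sum - sum_b) / w_f   # foreground mean grey level
        var = w_b * w_f * (m_b - m_f) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(pixels, t):
    """Map each pixel to black (0) or white (255) around the threshold."""
    return [0 if p <= t else 255 for p in pixels]
```

On a cleanly bimodal image (dark text on a light background), the threshold lands between the two peaks of the histogram, which is exactly the kind of black-and-white input Tesseract prefers.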

Training the Tesseract engine

Before processing the images, the OCR engine needs to be ‘trained’ in order to work properly. For this reason, I downloaded jTessBoxEditor – a Java program for editing box files (files generated by Tesseract when detecting glyphs). Since the project runs on Ubuntu, I had to download and install the Java Runtime Environment (JRE) to run jTessBoxEditor.

Since my portion of the project involves training the engine, I need to generate sample data for it. The engine needs to be fed samples of Times New Roman, Calibri, and Arial – the three fonts we came across in our images.

Our progress so far

Tesseract is still being trained, and the sample data is yet to be generated. Realizing that the required fonts were available in my Windows installation, I copied the font files over to Ubuntu and successfully installed them. One step down, several more to go!

On the image processing side, we are currently evaluating a Python implementation of ‘font and background colour independent text binarization’, a technique pioneered by T Kasar, J Kumar and A G Ramakrishnan.

I modified the code to work with Python 3, in order to avoid discrepancies between the various modules of our project. Here is the link:

A web forum also suggested that the input images be enlarged or shrunk in order to make the text legible. This task requires ImageMagick, software that offers a CLI (command-line interface) for image manipulation. Therefore, I downloaded a bunch of grayscale text images (with the desired fonts, of course), and decided to convert all of them to PNG.

For some reason, I’m not able to do so, and have failed to convert any of them.

As an example, here is a sample command:

magick convert gray25.gif gray25.png

This is the error message I get in Terminal:

No command 'magick' found, did you mean:

 Command 'magic' from package 'magic' (universe)

magick: command not found

I’ve tried re-installing ImageMagick several times, but to no avail. I need to go through yet more web forums for a solution to this problem.

What’s the scope?

This is a question almost everyone asks whenever I discuss my project. Indeed, it doesn’t look very promising at first sight, due to the tedious nature of the steps involved.

However, its scope is quite vast – ranging from the preservation of ancient texts and languages, to the translation and transliteration of public signage, to converting street signs to audio for the visually impaired. In fact, it may even serve as a last resort for driverless vehicles navigating an area when GPS fails.

We are only limited by our imaginations. Once merged with technology, they can be used to achieve miracles!

External Links

1. Font and Background Color Independent Text Binarization; a research paper:

2. Perspective rectification of document images using fuzzy set and morphological operations; a research paper:

3. jTessBoxEditor; a how-to guide:

4. AptGet/HowTo; a how-to guide:

Using Oracle’s VirtualBox – A Review

Of late, I have been tinkering around with Ubuntu. The reason? I needed to work on a Python project, and wasn’t making much headway into it.

Being a Windows user, I was finding it difficult to install the required Python modules for my project. This was especially exasperating with SciPy, a library that many scientific Python programs depend on. Unfortunately, I couldn’t get its latest distribution to work on anything but Linux.

At the same time, I was apprehensive of even touching Unix, since it had always spelt doom for my PC. Dual-booting Windows with a Linux distro had, in the past, caused many a computer to crash – right in front of my eyes.

Hence, I had to overcome my apprehensions, tap into the hitherto alien Unix environment, and work on my project from there –  whether I enjoyed it or not.

While scouring the internet for solutions, I stumbled upon VirtualBox, VM (virtual machine) software by Oracle. Upon going through a few tutorials, I decided to give it a go.

What’s a Virtual Machine?

A virtual machine is software that emulates an OS (operating system). This way, the user can run and control one OS while working within another. You may think of it as one OS nested within another.

It’s amusing to think, “What if I run a virtual machine within my virtual installation? Is infinite nesting of OSes allowed?”

Ideally, such an experiment would be possible. In reality, hardware limitations would render it futile, since emulation saps a significant portion of the host OS’s resources, such as RAM and disk space. The hardware has to be divided between the host and the nested (also called guest) OS – a situation very similar to dual-booting.

To illustrate, consider the following infographic.


It’s clear that with more nested OSes, each one has very little computational power at its disposal. In fact, OS 7 is a mere shadow of the C64 (Commodore 64) – itself an obsolete system by today’s hardware standards, with just 64 KB of RAM.

A review of the installation

Here’s one universally appreciated feature of VirtualBox – it allows hassle-free toggling between the guest OS (in my case, Ubuntu) and the host, all with a simple click of the mouse button.

This is especially useful to me, since I’m a staunch Windows user, and can’t stand Ubuntu’s interface for too long. Sure, Ubuntu allows for quick development of program code, but when it comes to good UI (user interface), I feel that its developers should borrow some design tips from Windows 8.1, which is the OS currently installed on my PC.

In fact, here’s what it looks like, along with VirtualBox:

Since my PC has around 500 GB of hard drive space and 6 GB of RAM, I’ve found it convenient to run a fully installed (virtual) version of Ubuntu, with 20 GB of disk space and 1 GB of RAM allocated to it.

So far, it’s working well for me, and I’m quite satisfied with it!

Related links 

The VirtualBox website:

Installing Ubuntu within Windows using VirtualBox; a how-to guide:

Sharing files between VirtualBox and host; a how-to guide: