Pythagorean triples: Do it right

What: Minimal lines of code for calculating the side lengths of integer-sided right triangles whose sides are below a given threshold
Why: Functional programming paradigm and vector handling in different languages
How: Write minimal examples for: Frege, Java, SQL, R, Python, JavaScript. Please contribute!

Last week I went to a talk where Frege was introduced. Frege is a purely functional language based on Haskell. I once looked at Haskell, and the introductory example was the famous Pythagorean triples. That's also mentioned on the Frege page. I asked myself: How can this be done in Java, R, or SQL?

Here is my list of implementations. Please contribute if you know more or have a better (shorter) version. Line breaks are inserted for better layout. All implementations return something like:

Frege

This is not tested. I am not sure how Frege handles the inserted line breaks.

Java

Tested.

For the Java version: Let's create a model class first. This makes the stream handling easier. It is just some sugar.

Now, let's write the logic:

SQL (Oracle)

Tested.

R

Tested.

Python (3)

Tested.
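The original listing is not reproduced here. As a minimal sketch (not the tested version referred to above), a Python 3 list comprehension could look like the following. The function name `pythagorean_triples` and the choice to include the threshold itself (`c <= limit`) are assumptions made for illustration.

```python
def pythagorean_triples(limit):
    """Return all (a, b, c) with a <= b <= c <= limit and a^2 + b^2 == c^2."""
    return [
        (a, b, c)
        for c in range(1, limit + 1)   # hypotenuse
        for b in range(1, c + 1)       # longer leg
        for a in range(1, b + 1)       # shorter leg, a <= b avoids duplicates
        if a * a + b * b == c * c
    ]


print(pythagorean_triples(13))  # [(3, 4, 5), (6, 8, 10), (5, 12, 13)]
```

The three nested `for` clauses mirror the nested generators of the Haskell-style comprehension. The triples come out ordered by the hypotenuse `c`.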

JavaScript

Tested.

Creation of filled arrays: See here.

Integer division: See here.

Fly me to the stars: Interactive graphs with Docker, Jupyter and R

What: Documented data analysis on the fly
Why: Keep documentation, implementation and operation in one document; create publication ready reports; explore data interactively
How: Use Docker (containerization), Jupyter (documenting and code execution), R (analysis) and Plotly (interactive plots, also very interesting for JavaScript graphs in web front ends)

You have to install Docker to run this tutorial. All dependencies are automatically installed in the container. Otherwise, you have to install the dependencies manually. Additionally, you need a browser to get access to Jupyter. The interactive plots work best in Chrome. I had some issues running it with Firefox.

Copy the following content into a file called Dockerfile:

In the same directory, you need the file installJupyter.R with the following content:

Now, run the following two docker commands in the directory where the Dockerfile is located:

You should now have the running docker instance available on the docker VM, port 8888:
[Screenshot: Jupyter]

Via the „Upload“ button, upload the notebook from here (Interactive graphs with Jupyter and R) and open it. Execute all steps via „Cell“ -> „Run all“. Now, it should look like this:

[Screenshot: Jupyter with Plotly graph]

Did you notice the interactive plot? You can zoom, export as PNG, have tooltips, … All without programming anything. Cool, isn’t it?

Ok. Your customer is impressed but does not like the modern HTML stuff? He wants to print it out and send it away via mail? No problem. Just change the interactive plots to non-interactive ones and export to PDF via „File“ -> „Download as“ -> „PDF via Latex“ (that’s why the Dockerfile above contains all the stuff like pandoc, latex, …). You will get a nearly publication-ready report.

Win a card game against your kids with OCR and statistics

What: OCR & R; Analyze standardized hardcopy forms electronically
Why: Win a card game with a lot of cards to remember (car quartett)
How

You need:

  • The card game
  • A scanner
  • Gimp
  • A Linux machine
  • An hour of free time

1. Set up a virtual machine or an existing Linux installation

I used Ubuntu Xenial 64-bit. You may have to adapt the steps a little.

  1. Install tesseract
  2. Install Java

2. Scan the cards

Scan all the cards. Make one image per card. Make sure you always place the cards at the same position in the scanner (for example the upper right corner). All images should have the same resolution and size. That way, the same region of the image corresponds to the same region on every card (for example a name field). The cards can look like this (you can guess what game I played ;-)):

You have to tweak the images a little bit to get good OCR results. Here is what I did:

  1. Use Gimp to blur the images (Filter ⇒ Blur ⇒ Blur)

3. Basic image processing with Java & tesseract

The cards I scanned had some defined regions with numerical or text values (see figure above). You can improve the OCR results dramatically if you know where to look for the text. Create a small class containing the information about each region. This class should also contain a flag that says whether you are looking for text or numerical values.

Set up the regions for your image as you like. That's what I used for the cards (coordinates are pixels in the scanned image).
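The original Java listings for the region class and its setup are not reproduced here. As a rough Python illustration of the same idea, a region can be modelled as a small immutable record. All field names, region names and pixel coordinates below are hypothetical placeholders, not the values used for the real cards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One rectangular area of the scanned card that holds a single value."""
    name: str       # hypothetical label of the field on the card
    x: int          # left edge in pixels
    y: int          # upper edge in pixels
    width: int
    height: int
    numeric: bool   # True: expect digits only, False: expect free text

    def box(self):
        """Return the region as (left, upper, right, lower) pixel coordinates."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)


# Made-up example coordinates; measure the real ones on your own scans.
REGIONS = [
    Region("name", 40, 30, 400, 60, numeric=False),
    Region("cylinders", 60, 520, 120, 40, numeric=True),
]
```

Keeping the numeric flag next to the coordinates means the OCR step can later decide per region whether to restrict recognition to digits.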

Now, process each image. For each image I did the following steps:

  1. Read in the image
  2. Transform the image to black/white ⇒ better OCR results
  3. Do some basic clipping to extract one image for each region on the card you are interested in. Start tesseract for each image.
  4. Collect the results from the text file output of tesseract (or read directly from the output stream of the process). There may also be a batch mode of tesseract; I have not checked.
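The steps above can be sketched in Python as follows. This is not the original Java code. It assumes the third-party Pillow library for image handling and a `tesseract` binary on the PATH. The function names are hypothetical. The options used (`stdout` as output base, `--psm 7` for a single text line, `-c tessedit_char_whitelist=...` to restrict the characters) are given as understood from the tesseract command line; older 3.x releases spell the page segmentation flag `-psm`, so check your installed version.

```python
import subprocess


def crop_and_binarize(image_path, box, out_path, threshold=128):
    """Steps 1-3: read a scan, make it black/white, cut out one region.

    box is (left, upper, right, lower) in pixels of the scanned image.
    """
    from PIL import Image  # assumption: Pillow is installed (imported lazily)

    image = Image.open(image_path).convert("L")  # grayscale
    # Every pixel becomes pure black or pure white.
    binary = image.point(lambda p: 255 if p >= threshold else 0)
    binary.crop(box).save(out_path)


def tesseract_command(image_path, numeric):
    """Build the tesseract call for one clipped region image."""
    command = ["tesseract", image_path, "stdout", "--psm", "7"]
    if numeric:
        # Only allow digits for numerical fields.
        command += ["-c", "tessedit_char_whitelist=0123456789"]
    return command


def ocr_region(image_path, numeric):
    """Step 4: run tesseract and read the text from its output stream."""
    result = subprocess.run(
        tesseract_command(image_path, numeric),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Reading from standard output avoids the intermediate text files. Whether the digit whitelist is honoured depends on the tesseract version and OCR engine in use.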

4. Analyze the results with R

Let's assume you now have the results in a list like this:

You can read in the values easily in R with:

This is read in as a data frame. You can get a first impression of the best values with the summary function:

Write a simple function to get the values for all categories:

And apply it:
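The R listings for this section are not reproduced here. As a stand-in, the same analysis can be sketched in Python with the standard library only. The column names, the semicolon separator and the sample rows are invented for illustration and are not the OCR results of the real cards. The sketch also assumes that the highest value wins in every category, which is not true for every category of such a game.

```python
import csv
import io

# Invented sample data, only to make the sketch runnable.
SAMPLE = """name;cylinders;power
Card A;6;420
Card B;8;380
Card C;8;620
"""


def best_per_category(rows, name_column="name"):
    """Map each category to (card name, value) of the card with the highest value."""
    best = {}
    for row in rows:
        for category, value in row.items():
            if category == name_column:
                continue
            number = float(value)
            # On ties the first card seen is kept.
            if category not in best or number > best[category][1]:
                best[category] = (row[name_column], number)
    return best


rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter=";"))
for category, (card, value) in best_per_category(rows).items():
    print(category, card, value)
```

For the sample rows this reports Card B for the hypothetical `cylinders` column (the first card with the top value) and Card C for `power`.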

And voilà: You have to remember just a few cards with the highest values. Next time I will ask for „Zylinder“ in case I have the „Scania R620“.

Http-Server for jersey app with Java standard tools

What: Create a simple HTTP server for a Jersey app without traditional application servers.
Why: KISS & useful for microservices
How: Use built-in Java classes

Download the following dependencies (or include them in your pom.xml):

7-line Java way, minimal

Usually, you need a little more, such as:

  • CORS support
  • Jackson
  • Logging

Create the following class:

Add the following lines:

Full example:

Chess in space, wondrous stuff and IT