Sunday, 20 November 2016

How to get started with development in the cloud for free: tools for blob storage development

Getting started in the cloud can be an overwhelming experience. While you are learning to work with the cloud, you also want to keep your development costs as low as possible. In this blog and the next one I will therefore present a use case that allows you to write your first cloud blob storage program at no cost. We will use Azure, the cloud offering of Microsoft.

The use case consists of uploading a file to the cloud so that a special cloud service can process it and return the processed result, which you can then download again.

In this first blog I will present the development tools that you need to build this use case. We will use an emulator for the cloud. In this way you can postpone using your free Azure credits until you have a better idea about your first cloud project.

What is Blob Storage
Blob storage is a service for storing large amounts of data or files in the cloud. Files stored this way are called blobs. You can see blob storage as your personal hard drive in the cloud. Blobs can be made accessible through HTTP or HTTPS, and you can either make them publicly available to the world or use them to store private application data.

To use blob storage in the cloud, you need a storage account. Inside your storage account you have containers, which you can compare with the directories on your computer, and these containers store the blobs. Suppose you store a blob myblob in the container mycontainer of the account myaccount.

The URL of this blob then looks as follows: https://myaccount.blob.core.windows.net/mycontainer/myblob. The equivalent on your Windows computer, for a user myuser with a file myblob in the directory mydirectory, is the file C:\Users\myuser\Documents\mydirectory\myblob.
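To make the account, container and blob hierarchy concrete, here is a minimal Python sketch that builds the address of a blob. The function name blob_url is made up for this illustration. The sketch assumes the standard Azure host name pattern and the default local address of the storage emulator (127.0.0.1, port 10000), where the account name moves from the host name into the path.

```python
def blob_url(account, container, blob, use_emulator=False):
    """Return the URL of a blob in Azure or in the local storage emulator."""
    if use_emulator:
        # Assumption: the emulator runs on its default address and port.
        # Here the account name is the first part of the path.
        return f"http://127.0.0.1:10000/{account}/{container}/{blob}"
    # In the cloud the account name is part of the host name.
    return f"https://{account}.blob.core.windows.net/{container}/{blob}"


print(blob_url("myaccount", "mycontainer", "myblob"))
print(blob_url("devstoreaccount1", "mycontainer", "myblob", use_emulator=True))
```

The second call uses devstoreaccount1, the fixed account name of the storage emulator that is introduced later in this blog.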

The blob storage emulator will in this case also emulate the storage account. 

Use case
We are building the application that is shown in the picture below. On our computer we have the local file C:\test_files\Awesome_local_file.txt. A web service in the cloud performs a special operation on this file that we can't perform locally. Therefore we need to upload Awesome_local_file.txt to the container mycontainer as the blob larf_YYYYMMDDHHmm.txt. In this way the web service is able to access the file and perform its operation on it. As output, the web service produces the file olarf_YYYYMMDDHHmm.txt, which it stores in the container mycontainer. Afterwards you can download this file back to your computer as Awesome_output.txt.
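The YYYYMMDDHHmm part of the blob names is a timestamp. As a small illustration, the Python sketch below generates such names. The helper name blob_names is hypothetical, and the sketch assumes that the input and the output blob share the same timestamp, which the use case does not state explicitly.

```python
from datetime import datetime


def blob_names(now=None):
    """Return the (input, output) blob names for one upload."""
    now = now or datetime.now()
    # YYYYMMDDHHmm: year, month, day, hour (24h) and minute.
    stamp = now.strftime("%Y%m%d%H%M")
    # Assumption: the output blob reuses the timestamp of the input blob.
    return f"larf_{stamp}.txt", f"olarf_{stamp}.txt"


input_blob, output_blob = blob_names(datetime(2016, 11, 20, 9, 5))
print(input_blob)   # larf_201611200905.txt
print(output_blob)  # olarf_201611200905.txt
```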

Needed free resources for blob storage development
Development Environment
This part is based on Get started with Azure Blob storage using .NET. As the development environment we will use Visual Studio. Visual Studio Community is a free version of Visual Studio.

When you have installed Visual Studio, open it, select New Project in the left column, choose a new Visual C# Console Application as demonstrated in the picture below, and press OK.

When you have created your project, you need to add some extra libraries with the NuGet Package Manager. You can launch it by selecting NuGet Package Manager from the Tools menu. Next, select the Package Manager Console.

In the console at the bottom, type the following two commands:

  • Install-Package WindowsAzure.Storage
  • Install-Package Microsoft.WindowsAzure.ConfigurationManager
Azure Storage Emulator

For this use case we will use a blob storage emulator. This means that your computer emulates the cloud blob storage environment locally, so you don't need to create an Azure account yet. To be able to use the Azure Storage Emulator, you first need to have an instance of SQL Server installed. SQL Server 2016 Express is a free option.

When SQL Server has been successfully installed, you can install the Azure Storage Emulator. Once the storage emulator is installed, you can start it from the program menu. You will see a command window similar to the one below when everything has run correctly.

Authenticating requests against the storage emulator
The benefit of using the storage emulator is that the account name and the account key are the same for all developers using the storage emulator, so they don't need to be kept secret. These values are:
Account name: devstoreaccount1
Account key: Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==

When you want to use this emulator from Visual Studio, you need to put these values in the app.config file. We will cover this in more detail in the next blog, when we build the use case.
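Client libraries usually receive the account name and key as one connection string. The Python sketch below assembles such a string for the emulator. It is an illustration rather than the configuration used in the next blog: the function name is made up, and the sketch assumes the default blob endpoint of the emulator (127.0.0.1, port 10000).

```python
EMULATOR_ACCOUNT_NAME = "devstoreaccount1"
EMULATOR_ACCOUNT_KEY = (
    "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/"
    "K1SZFPTOtr/KBHBeksoGMGw=="
)


def emulator_connection_string(host="127.0.0.1", blob_port=10000):
    """Build a storage connection string that points at the local emulator."""
    # Assumption: the emulator listens on its default host and blob port.
    blob_endpoint = f"http://{host}:{blob_port}/{EMULATOR_ACCOUNT_NAME}"
    parts = [
        "DefaultEndpointsProtocol=http",
        f"AccountName={EMULATOR_ACCOUNT_NAME}",
        f"AccountKey={EMULATOR_ACCOUNT_KEY}",
        f"BlobEndpoint={blob_endpoint}",
    ]
    return ";".join(parts) + ";"


print(emulator_connection_string())
```

The Azure storage libraries also accept the shortcut UseDevelopmentStorage=true, which stands for the same emulator account.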

Microsoft Azure Storage Explorer
To access your files in storage with a desktop application similar to Windows Explorer, you can install Microsoft Azure Storage Explorer. To connect to your storage emulator, use the account name devstoreaccount1 and the account key Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==

The Microsoft Azure Storage Explorer will now look like the picture below.

Congratulations, you have now set up all the parts you need to get started with blob storage. In the next blog we will show step by step how you can build the use case. Good luck with your journey to the cloud.

Sunday, 6 November 2016

Efficient debugging of cloud applications using an Azure case study

Cloud computing has taken an important place in today's world of software development. The triage of software crashes, however, has become more challenging because of the use of the cloud. Software crashes never happen at a good time, so it is important to know beforehand where you can get all your logs. That way you can build an efficient triage strategy.
In this blog post I will share some tips learnt from analyzing crashes of a cloud data solution that I recently developed. This data solution was a web service derived from Azure Machine Learning Studio (Azure ML) that consumes and produces blob storage files. Blob storage is the data storage solution from Azure. Although some of the tips are specific to Azure ML and blob storage, they can also help you to build an efficient triage strategy for other cloud solutions.
In the next part of this blog I will first describe the cloud architecture, then the different logs that are available and how you can access them and finally, I will discuss some bug triage strategies.
High level overview of the cloud architecture

Web service from Azure ML

In the picture above, you can see an overview of the architecture that I built around the web service derived from Azure ML. The only important thing that you need to know about Azure ML is that it offers many data science and machine learning algorithms in an easy-to-use form for solving complex data problems. Such a solution is called an experiment in Azure ML. An experiment consists of different components that are connected to each other. When your experiment satisfies the requirements, you can deploy it as a web service and make it part of your cloud solution.
Web services can be consumed in two modes: Request-Response Service and Batch Execution Service (BES). In the first mode you provide only one data point to be evaluated by the web service. In the second mode you provide larger volumes of data to be consumed by the web service. In this example I use the BES mode.

Feeding the web service from blob storage

The web service is fed with files from Azure blob storage. Azure blob storage is a service for easily storing files in the cloud. In the current architecture, I change my files locally, and a script then uploads these files automatically to blob storage. The web service consumes these blobs and produces new blobs. Finally, the new blobs are downloaded to my local machine.
Multiple instances of this web service can run at the same time. If the maximum number of simultaneous instances has been reached, new instances are queued until a running one has finished. In a perfect world, everything runs perfectly and my blog post would end here. However, bugs and crashes are part of real life. Therefore, I will now provide some tips for efficient triage of bugs, so that you can deal with them when they appear in production.

Have access to all the evidence

Know which logs are available and how you can acquire them

One of the most important things before you can start the triage process is having access to all available logs. On the one hand, the cloud components that you are using might be generating client side logs that you can easily access from your cloud portal. They might however not be turned on by default. On the other hand, your cloud provider also might be generating server side logs that you might get access to in case of need. Finally, you also must make sure that your own application generates useful logs that you also save in a consistent way.

Turn on client side logging in Azure ML

To access your logs, you need an Azure account. The client side logs are not automatically turned on in Azure ML. You can turn on these logs through the Classic portal or through the Azure Machine Learning Web Services portal. I prefer to use the Azure Classic Portal. You will be able to find these logs in the ml-diagnostics folder in your blob storage.

Get familiar with the log file structure

Your log files will be stored under ml-diagnostics in your blob storage, and each web service has its own unique identifier there. When you run your web service, you will see one folder for each run of the web service.
If you look into the folder belonging to one run of a web service, you will find files that are named in the following way: ‘COMPONENT_TYPE’_‘NB’.stdout and ‘COMPONENT_TYPE’_‘NB’.stderr. In the stdout file you will find the normal output information and in the stderr file you will find the error information.
Examples of the ‘COMPONENT_TYPE’ are Apply%20SQL%20Transformation, Execute%20R%20Script and Join%20Data. These refer to different components, like SQL transformations, R scripts and Join components, that you can add in Azure ML.
It is, however, still a challenge to know which ‘NB’ corresponds to which component in Azure ML. For the Python script and the R script components you can solve this easily by adding an extra print command with a useful identifier, which makes it easier to determine which of the components you are looking at. For the other ones you need to dig deeper into the file structure.
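The file naming scheme described above can be taken apart with a few lines of Python. This is a minimal sketch under assumptions: the function name parse_log_name is made up for this illustration, and ‘NB’ is assumed to be a plain integer, which is not guaranteed by Azure ML.

```python
import re
from urllib.parse import unquote

# Assumption: names look like <COMPONENT_TYPE>_<NB>.stdout or .stderr,
# with NB being a plain integer.
LOG_NAME = re.compile(r"^(?P<component>.+)_(?P<nb>\d+)\.(?P<stream>stdout|stderr)$")


def parse_log_name(file_name):
    """Split a diagnostics log file name into component, number and stream."""
    match = LOG_NAME.match(file_name)
    if match is None:
        return None
    return {
        # %20 is the URL encoding of a space.
        "component": unquote(match.group("component")),
        "nb": int(match.group("nb")),
        "stream": match.group("stream"),
    }


print(parse_log_name("Execute%20R%20Script_3.stderr"))
```

Grouping the parsed names by component and number gives a quick overview of which components wrote error output in a run.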

Save your local logs

Also, make sure that you keep the local logs of the program that calls the web service. This will help you later on with debugging issues. Useful information to log includes time stamps, unique identifiers, the files that you are saving and the errors thrown by the web service. Also make sure that you keep the information that you need to revert changes in case of a crash.
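As an illustration of such a local log, here is a minimal Python sketch based on the standard logging module. The function name, the log file name and the log format are choices made for this example and not the setup of the original solution. Every line gets a time stamp and a unique identifier for the run.

```python
import logging
import uuid


def make_run_logger(log_path="webservice_calls.log"):
    """Return (logger, run_id); every log line carries a time stamp and the run id."""
    run_id = uuid.uuid4().hex  # unique identifier for this call of the web service
    logger = logging.getLogger(f"webservice.{run_id}")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s run=%(run_id)s %(message)s")
    )
    logger.addHandler(handler)
    # The adapter adds run_id to every record so the formatter can print it.
    return logging.LoggerAdapter(logger, {"run_id": run_id}), run_id
```

A caller would create the logger once per run, for example with log, run_id = make_run_logger(), and then write lines such as log.info("uploaded %s", blob_name) or log.error("web service failed: %s", message).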

Ask for an example of a cloud side log in a normal case

At the cloud side there might also be extra log information available that you can’t access directly yourself. If your web service is a critical component of your data solution, it might be a good idea to ask support for these log files once while everything is running normally. That way you have an idea of what information you can derive from these log files. You can then also ask for these logs when you run into a problem.

Catching the bugs

Is it a missing data issue, a data formatting issue or something else?

Telling a data issue apart from a formatting error is tricky. A formatting error means that there is an error in one of the input files, so that the web service couldn’t run till the end. A data issue means that the data was uploaded to blob storage but a component of the web service started running before the data was available. This also means that the web service couldn’t run till the end. Besides these two, you can also have a cloud side error, which again means that your web service didn’t run till the end.
After some trial and error, I have established the following steps to determine the root cause of the issue.
  1. I go into Azure Machine Learning Studio and run the Azure Machine Learning experiment with the input blobs that caused the bug.
  2. If the experiment runs smoothly till the end, I examine the log files from the last component and look for something that says 0 rows. If I find this, a component started to run before the data was available. Otherwise there might be a cloud side issue, which I will discuss later.
  3. If the experiment fails, I now know on which component it failed. During development of the R and Python components I made sure to add enough logging information, and this helps me to track down the issue and resolve it. The fix consists of correcting the local input file and uploading the corrected file to the cloud.
If these steps don’t point to a data issue or a formatting error, you might have run into a cloud side error.

Cloud side error

There are two types of cloud side errors: glitches and systematic errors. A cloud side error always starts out as a glitch, because there always has to be a first time that something bad happens. Make sure that you store the information about this error so that you can recognize it later. If you don’t see the error happening again, then it was a glitch.
However, when you see the same error happening more than once, you might have run into a systematic error and it is time to call the Azure ML support team for help. In this case, make sure you provide all your local logs and your client side logs, so that the issue can be nailed down as fast as possible.
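If you store each cloud side error message, telling glitches from systematic errors becomes a matter of counting. The Python sketch below assumes that identical errors produce identical message strings, which is a simplification, and the example messages are made up.

```python
from collections import Counter


def classify_errors(error_messages):
    """Label each distinct error as a glitch (seen once) or systematic (seen more often)."""
    counts = Counter(error_messages)
    return {
        message: "systematic" if count > 1 else "glitch"
        for message, count in counts.items()
    }


# Hypothetical messages collected from the local logs.
seen = ["timeout in module A", "internal error 500", "timeout in module A"]
print(classify_errors(seen))
```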


I hope that these tips help you to discover new pieces of logging information in your cloud architecture. Hopefully you will never need them, but they will help you out in the unfortunate case that your cloud application crashes.

Tuesday, 20 September 2016

Four reasons not to go to a technical Meetup

Don’t let practical obstacles prevent you from going to a technical Meetup.
Professional development is something that we all need to spend a lot of time on if we want to keep up with the advances in technology and try to advance in our career. Most of the time this is a pretty individual and lonely task. Did you know there are many technical groups that meet up on a regular basis? This could be a great way to learn and also network with others.

Meetup is an easy and straightforward place to start looking. Make an account on Meetup, add your interests and go to some interesting meetings. Thanks to the machine learning algorithm, a lot of interesting suggestions will then come to your inbox every week.

Fear, time commitments, not exactly knowing what is available and whether it will interest you are all barriers that might be holding you back. I will share four fears that might be holding you back from going to a meetup and will explain how to overcome these fears.

But … I don’t know anyone!

Everybody has gone to a Meetup for the first time at some point, and chances are high that you won’t be the only new person there. You can also just ask a friend to join you.

If you are not fond of hours of networking, you can pick a Meetup with a lot of talks in it. In case you really fear talking with a lot of unknown people, look at the agenda and arrive just a few minutes before the talk is scheduled to start. In case there is no real agenda available for the meeting, talks normally start 10 minutes after the start of the meeting. If the talk has already started when you arrive, just sneak in at the back. If you enjoyed the talk and go a second time, there will certainly already be a lot of familiar faces.

But … I don’t know enough about the topic!

On the internet there is probably months’ worth of reading material available on your favorite topic. Every week a few extra months’ worth of reading material is added, and there are a lot of books in the library and e-books for your tablet available too. I haven’t even mentioned all the online courses that you can take or all the podcasts that are available. If you are studying alone and try to read everything available, you don’t really have the feeling that you are progressing, because there will always be that extra piece of information that you haven’t consumed yet. Do you think the people at the meeting have really read every piece of available information? Nope!

Therefore, it is also important to talk with people who share interests in a similar topic. In this way you can share ideas about what you have read and compare your knowledge levels. You can determine what you are good at and which parts you still need to improve on. You also can get hints about material that you should explore further.

Listening to a presentation or watching a demo will allow you to remember more about the topic. You also have the possibility of immediate interaction if something is not really clear.

You will also learn about various speaking styles. These you can try out at your job when you are discussing your own work. In case the meeting is not in your mother tongue, it will also allow you to learn these terms in your new language. It will also help you to learn the correct pronunciation of the terminology.

But … I won’t find the location of the Meetup!

It might indeed be a challenge to find the location of the Meetup. There might be construction going on. The signs to the correct location might be missing. But please don’t let this be a reason for not going to a Meetup, because you have already gotten this far. If you have a hard time finding the location of the Meetup, you are probably not the only one. When you eventually arrive, just briefly mention to the organizers that the place is not that easy to find. In this way they can give better directions to other people who are still coming and make accommodations for future meetings. Also make sure that you have your charged smartphone with you in case you get lost. Google Maps is your friend.

But … I will be the only woman there!

Unfortunately, this is a reason that might be holding back a lot of women, and it eventually becomes a self-fulfilling prophecy. You actually might be surprised about how many technical meetups have at least one woman in their organizing team. In these Meetups you never need to feel alone as a woman. Also, by going to such a Meetup, you are supporting them.

But also don’t be scared if you are going to a Meetup where you might be the only woman. The men will be really happy that you are going and will be really friendly and welcoming. Imagine the power that you have in this case: just by attending, you enable gender diversity in the group.

Another approach might be to send a message to one of the women in tech initiatives in your area and ask whether there is anyone who wants to join you at a certain meetup. In this way you can convince other women to join who were struggling with the same objection.

Fast forward a few months …

I am convinced, I went, loved the meetup, can I give a talk myself?

Congratulations! You certainly can give a talk at your favorite Meetup. Let the organizers know and they will give you a time slot. Good luck with your talk.

Saturday, 30 January 2016

How to (re)light the mathematical flame in you.

Originally posted on 

Nowadays there is a massive growth of data which is leading to mathematical modeling appearing in a growing number of industries. This is the perfect time to become curious about all those things you learnt in your mathematics classes and courses. To help you with this, I’m sharing some fun mathematical resources to explore.

Study of some surprising social networks

Everyone knows the standard social networks LinkedIn and Facebook. A lot of fancy mathematics is done behind the scenes to help you to find people you may know and to check how you are ranked among your contacts and your company.
This type of social network analysis can also be applied to some surprising situations, for example the Marvel Superhero network, the Movie Actors Network and a social network of the characters in Star Wars. For all of these networks the heroes, the actors or the characters represent the nodes of the graph. How the nodes are connected is different for each of these networks. For the superhero network, there is a line between two characters when they occurred significantly in the same comic book. For the actors, there is a line if they played in the same movie. Finally, for the Star Wars characters, there is a line if they spoke to each other in a movie.
In the links that I mentioned you can see the colorful visualizations of the derived networks and some insightful mathematical analysis. If you want to play with the superhero network, this is a useful resource. It will also help you to gain some hands-on experience with Spark. Spark is an engine for processing very large sets of data. It has a machine learning library, which means a lot of mathematical programs that can be used to find patterns in large sets of data. If you want to make beautiful graph visualizations, you can use Gephi. Gephi is open-source software for visualizing and analyzing large network graphs.
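To get a feeling for how such a network is built, here is a minimal Python sketch that uses only the standard library. It connects two actors when they appear in the same movie and then counts the connections of each actor. The movie and actor names are made up for this illustration, and the real analyses use far larger data sets and richer measures.

```python
from collections import defaultdict
from itertools import combinations


def build_network(appearances):
    """Connect two people when they appear together in the same movie or comic."""
    neighbours = defaultdict(set)
    for cast in appearances.values():
        # Every pair within one cast gets an edge in both directions.
        for a, b in combinations(sorted(set(cast)), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return dict(neighbours)


def degree(network):
    """Number of direct connections of each node."""
    return {node: len(adjacent) for node, adjacent in network.items()}


# Hypothetical toy data: movie title -> actors in that movie.
movies = {"Movie A": ["Ann", "Bob", "Cy"], "Movie B": ["Bob", "Dee"]}
print(degree(build_network(movies)))
```

In this toy example Bob is the best connected actor, because he appears in both movies. Someone who never shares a movie with anyone does not show up in the network at all.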

Recreational Mathematics

Do you just want to play with mathematics for fun or want to dust off your math skills? Recreational mathematics might be something for you and you can pick from various flavors: mathematical games, mathematical puzzles, tessellations, fractals, Rubik’s cube solving, ….
Various people write blogs about their fun with recreational mathematics. Here are some examples: a blog of a friend, a blog of another friend and a blog on math blogs. Maybe you’ll even be inspired to write your own recreational math blog.

Mathematics clubs

Do you want to discover new fields of mathematics and grow your professional network? You might have been part of a mathematics club when you were at university. Similar versions of those clubs still exist for math lovers in the working world.
The Meetup group KW Intersections is an example of such a group. They talk about various topics in research in mathematics and computer science. Some topics of the last two years are a geometrical proof of π, quantum computing, neural networks, and type theory. The Meetup offers so many different colours and flavours of mathematics.
It is amazing how much energy comes out of such a group. Hearing someone give a talk about their passion makes you look at your own projects again, or lets you dive deeper into a paper so that you can explain it yourself in front of a group. You don’t have time to join this Meetup? No worries, one of our members films a lot of the talks and puts them on his channel.


I hope you found these resources fun and interesting and that you have learnt about the many places where math can pop up, and that it can be fun when it does! Not all mathematics has to be useful; sometimes it can just be fun.
Do you have any fun mathematical resources to share? I’d love to hear all about them. Just add a response below this article.