Android, Design, Mobile

MDM Introduction

This is the first blog in the series on building a custom MDM solution.

First of all, what are we trying to build?

We are building an instructional platform for K-12 education based on Android tablets. These tablets will be used by teachers and students, primarily in a classroom. The tablet helps the teacher facilitate a class and provides various tools and applications to interact with the students and help them learn better.

So, what is an MDM and why do we need it?

MDM stands for Mobile Device Management. Given that these tablets will be deployed in various schools, the MDM solution lets us manage them remotely: downloading learning content, installing educational apps, setting the password policy on the tablet, and so on. Because the tablet is used in a school environment, the MDM solution is also responsible for providing a safe and secure container for running the applications, which includes enabling or disabling certain settings on the tablet. The MDM solution also provides single sign-on for the tablet user. This allows us to capture the user’s credentials once and use them for the lifetime of the user, and it also gives us the ability to customize the content on the tablet for that specific user.

So, how did we implement it?

[Figure: MDM server architecture]

The MDM solution has two primary components: the MDM agent, which lives on the tablet, and the MDM server. At a high level, an Admin user interacts with the MDM server via a web portal to perform an “operation” on a tablet (or a group of tablets). An example of an operation would be “install an application” on a tablet. To execute this operation, the MDM server notifies the tablet via GCM (Google Cloud Messaging). The MDM agent on the tablet receives the notification and requests its operations from the MDM server over HTTP; the server sends the operations back, and the agent executes them on the tablet and sends the status back to the server. The status is then displayed to the Admin user on the web portal. The agent side of this flow is sketched below.
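
Here is a minimal sketch of that agent-side flow, assuming Android’s IntentService and HttpURLConnection, with the receiver registered in the manifest for the GCM receive intent. The class names, server URL and the /operations endpoint are made up for illustration and are not the actual implementation.

    import android.app.IntentService;
    import android.content.BroadcastReceiver;
    import android.content.Context;
    import android.content.Intent;
    import android.os.Build;
    import android.util.Log;
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // GcmNotificationReceiver.java -- the GCM payload is only a "wake up" nudge;
    // the actual operations are pulled from the server over HTTP.
    public class GcmNotificationReceiver extends BroadcastReceiver {
        @Override
        public void onReceive(Context context, Intent intent) {
            context.startService(new Intent(context, OperationSyncService.class));
        }
    }

    // OperationSyncService.java
    public class OperationSyncService extends IntentService {
        private static final String SERVER = "https://mdm.example.com"; // illustrative URL

        public OperationSyncService() {
            super("OperationSyncService");
        }

        @Override
        protected void onHandleIntent(Intent intent) {
            try {
                // 1. Ask the server for pending operations for this device.
                HttpURLConnection conn = (HttpURLConnection)
                        new URL(SERVER + "/operations?deviceId=" + Build.SERIAL).openConnection();
                BufferedReader reader =
                        new BufferedReader(new InputStreamReader(conn.getInputStream()));
                StringBuilder body = new StringBuilder();
                for (String line; (line = reader.readLine()) != null; ) {
                    body.append(line);
                }
                reader.close();

                // 2. Parse and execute each operation, then report its status back to
                //    the server (both steps are application-specific and omitted here).
                Log.i("MDM", "Pending operations payload: " + body);
            } catch (IOException e) {
                // Network failure: the server-side retry, or the pull on the next
                // reboot described later in the post, will recover.
                Log.w("MDM", "Operation sync failed", e);
            }
        }
    }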

Other examples of operations include:

  • Blocking the settings on the tablet
  • Enabling VPN
  • Resetting the password on the tablet
  • Sending a message
  • Unregistering the user on the tablet
  • Factory resetting the tablet

and many more.

In order to build some resilience into the system, we have introduced a retry mechanism on the server. If the server does not get a response from the tablet within a configured amount of time, it retries sending the notification to the tablet. We have also built some “pull” into the tablet: every time the tablet is rebooted, it requests its operations from the server. The operations are idempotent, which means that if they are executed multiple times, the end state of the system is the same.
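
The reboot-triggered pull is the simplest piece to show. A minimal sketch, assuming the hypothetical OperationSyncService from the earlier snippet and a receiver registered for BOOT_COMPLETED in the manifest:

    import android.content.BroadcastReceiver;
    import android.content.Context;
    import android.content.Intent;

    // BootCompletedReceiver.java
    public class BootCompletedReceiver extends BroadcastReceiver {
        @Override
        public void onReceive(Context context, Intent intent) {
            if (Intent.ACTION_BOOT_COMPLETED.equals(intent.getAction())) {
                // On every reboot, pull pending operations from the server. Because
                // operations are idempotent, re-running one we already executed is harmless.
                context.startService(new Intent(context, OperationSyncService.class));
            }
        }
    }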

In the next part, I will talk about the tablet architecture, which required some cool “hacking” on the Android operating system.

Android, architecture, Design, Mobile

Mobile Device Management

For the last year or so, I have been incredibly lucky to be working on building a custom MDM (Mobile Device Management) solution for Android tablets. This is a blog series where I talk about the general architecture of the MDM solution and the specifics of its Android tablet component and server component.

It’s broken down into 5 parts. Here they are:

  • Introduction
  • Tablet Architecture
  • Operation Serialization
  • Tablet Compliance
  • Operation Processing Workflow

Hope you enjoy the series. Comments welcome.

architecture, Craftsmanship, Design, Java, Refactoring

Soul coding

Yesterday I had a 12 hour non-stop[1] code fest to refactor a thin slice of a 2-tiered web application into a 3-tiered one. It was very productive, and I must say this is the kind of stuff that soothes my developer soul, hence the name. 🙂

The primary driver for the refactoring was that the core logic of the application was tightly coupled on both ends to the frameworks being used. On one side it was tied to the web framework, Play, and on the other end to the ORM, Ebean. We managed to move the business logic into a separate tier, which is good on its own, but it also let us unit test the business logic without involving the frameworks, which can frankly be quite nasty. As a follow-on effect, we also managed to split the models into two variants: one to support the database via Ebean and the other to generate JSON views using Jackson. This let us separate the two concerns and test them nicely in isolation. Similarly, we could test the controllers in isolation. We got rid of the bulk of our functional tests that were testing unhappy paths, because we now had unit tests at the appropriate places, viz. the controller, view model, service and database models.
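
To make the shape of the split concrete, here is a hypothetical illustration (the Order domain and all class names are made up, not our actual code): the business tier depends on a small repository interface instead of Ebean directly, and the JSON shape lives in its own Jackson view model, so each piece can be tested on its own.

    import com.fasterxml.jackson.annotation.JsonProperty;
    import java.math.BigDecimal;

    // Database model (in the real code this would be the Ebean-mapped entity).
    class Order {
        private final long id;
        private final BigDecimal total;
        Order(long id, BigDecimal total) { this.id = id; this.total = total; }
        long getId() { return id; }
        BigDecimal getTotal() { return total; }
    }

    // Persistence boundary: the Ebean-backed implementation lives in the data tier.
    interface OrderRepository {
        Order findById(long id);
    }

    // View model: only concerned with the JSON that Jackson renders.
    class OrderView {
        @JsonProperty("id") public final long id;
        @JsonProperty("total") public final BigDecimal total;
        OrderView(long id, BigDecimal total) { this.id = id; this.total = total; }
    }

    // Business tier: plain Java, unit testable with a stubbed OrderRepository,
    // no Play or Ebean anywhere in sight.
    class OrderService {
        private final OrderRepository repository;
        OrderService(OrderRepository repository) { this.repository = repository; }

        OrderView summaryFor(long orderId) {
            Order order = repository.findById(orderId);
            return new OrderView(order.getId(), order.getTotal());
        }
    }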

I was quite amazed at how much we could get done in a day. Here are some of the important takeaways from this experiment:

  • We had a discussion the previous day about how we wanted to restructure the code, primarily focusing on separation of responsibilities and improving the unit testability of the code. We agreed upon a certain set of principles, which served as a nice guideline going into the refactoring. On the day of the refactoring, we realized that not all the things we had discussed were feasible, but we made sensible adjustments along the way.
  • Keep your eye on the end goal. Keep reminding yourself of the big picture. Do not let minor refactorings distract you; write them down on a piece of paper so you don’t forget.
  • Pairing really helps. If you are used to pairing, it helps doubly when refactoring at this scale: it keeps you focused on the end goal, problems get solved quickly thanks to collective knowledge, and the decision-making cycle time is considerably reduced when making adjustments to the initial design. I would also say pick a pair who is aligned with you on the ground rules of how you are going to approach development. You don’t want to get into a discussion of how or why you should be writing tests and what a good commit size is.
  • Have tools handy that get you going quickly. Between my pair and me, we pretty much knew what tool to use for every problem at hand. At one point, we got stuck testing static methods and constructors. My pair knew about PowerMock, we gave it a spin and it worked, and there it was, included in the project (see the sketch after this list). Don’t spend too much time debating; pick something that works and move on. If it does not work for certain scenarios, put it on your refactoring list.
  • Thankfully for us, we already had a whole bunch of functional tests at our disposal to validate the expected behavior, which was tremendously useful for making sure we weren’t breaking stuff. If you don’t have this luxury, then pick a thin slice of functionality to refactor which you can manually test quickly.
  • Small, frequent commits. Again, the virtue of this is amplified in this kind of scenario.
  • Say no to meetings. Yes, you can do without them for a day, even if you are the president of the company. 🙂
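
On the PowerMock point above: a minimal sketch of stubbing a static method, assuming PowerMock with the Mockito API and JUnit 4. LegacyIdGenerator is a made-up stand-in for the kind of legacy class we were wrestling with.

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.powermock.api.mockito.PowerMockito;
    import org.powermock.core.classloader.annotations.PrepareForTest;
    import org.powermock.modules.junit4.PowerMockRunner;

    // Hypothetical legacy class with a static method the code under test depends on.
    class LegacyIdGenerator {
        static long nextId() { return System.currentTimeMillis(); }
    }

    @RunWith(PowerMockRunner.class)
    @PrepareForTest(LegacyIdGenerator.class)
    public class LegacyIdGeneratorTest {

        @Test
        public void staticMethodCanBeStubbed() {
            // PowerMock rewrites the class so the static call can be intercepted.
            PowerMockito.mockStatic(LegacyIdGenerator.class);
            PowerMockito.when(LegacyIdGenerator.nextId()).thenReturn(42L);

            assertEquals(42L, LegacyIdGenerator.nextId());
        }
    }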

Have you done any soul coding lately? 🙂

[1] Ok, not quite 12 hours, but it was on my mind all the time. 😉

Craftsmanship

Pragmatism hurts too…

I spent quite a bit of my early career learning pragmatism. Frankly, I did not know the word until I entered the industry. 😉 I got this as feedback quite a few times from my colleagues, and more so from clients. They would say, “You need to be more pragmatic”. What that really meant was: stop being so obsessed with your software engineering “craft” (my clients didn’t even think it was a craft :)). It’s ok to not have some part of the code tested before it hits production. It’s ok to not have automated tests for some part of the code. It’s ok to not pair program on critical code. It’s ok to have 2000-line classes with no tests, if they “already work”. Now I get it, right. All of this does make sense, in a given context.

But pragmatism hurts too. In this context, pragmatism means implicitly consenting to bad behavior. And again, this might be ok when you are in a bit of a crunch, but when it starts becoming the norm, it begins to hurt, and hurt a lot. When you see it becoming the norm, you need to start becoming a zealot for code quality. Best would be to have a good mix of zealots, or to use the positive connotation, “craftsmen or craftswomen”, on the team.

So, don’t forget to promote craftsmanship just as you would promote pragmatism, especially if you are the leader of your software team.

Cloud computing, Infrastructure

Cloud vendor migration

Recently we migrated our cloud infrastructure from Amazon to a different cloud vendor. I won’t get into the details of why we had to do it, but the experience of the migration itself was interesting and I want to provide some guidelines around the things you should consider, particularly around infrastructure automation, if you find yourself in a similar situation.

Going into this migration discussion, we clearly knew that Amazon was a better cloud vendor than the new one. We looked at a comparison site[1] to compare the features of the two cloud vendors. Amazon was a clear winner. This comparison gave us some pointers on where the new cloud vendor would be lacking. But rather than focusing on individual features, we decided to come up with our own “requirements spec” for our infrastructure, and then see how the new cloud vendor fared. We knew we would have to make some compromises, but most importantly, we understood where we would not compromise at any cost.

Our application is a fairly straightforward Rails app backed by a Postgres database and Memcache, hosted on Amazon virtual machines. We use a lot of Amazon services like S3 (storage), ELB (load balancer), Route 53 (DNS), SES (email), and so on.

One of the big things we were concerned about from the get-go was the “ease of automation” of setting up our infrastructure with the new cloud vendor. Our existing infrastructure setup is automated to a large extent using Puppet. Our infrastructure setup falls into three steps: Provisioning, Configuration and Deployment. I will explain these in a bit. These steps have different degrees of tolerance for automation, and we decided early on which of these were “should-be-automated” versus “must-be-automated”. Let’s talk about the steps:

1) Provisioning
This is the first step in infrastructure setup, which involves creating virtual machines, or in technical parlance, “provisioning” them. Once you have provisioned an instance, you get a virtual machine with a base OS installed on it, an IP address and credentials to access it. For our new cloud provider this would be a manual step, whereas Amazon lets you automate this piece very nicely if you use the AWS API. We decided this falls under “should-be-automated” because we did not see ourselves spinning up new machines frequently. Surely we were giving up the capability of “auto-scaling” our infrastructure, but we were ok with it. The way auto-scaling works is that Amazon monitors the load on your machines and automatically creates new machines to handle the extra load. It is actually a pretty cool feature, but we decided we did not need it, at least not in the near term.
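
For a sense of what “automatable via the AWS API” means in practice, here is a rough sketch using the AWS SDK for Java (1.x). The AMI ID, instance type and key name are placeholders, and a real setup would feed the new instance straight into the configuration step.

    import com.amazonaws.services.ec2.AmazonEC2;
    import com.amazonaws.services.ec2.AmazonEC2Client;
    import com.amazonaws.services.ec2.model.Instance;
    import com.amazonaws.services.ec2.model.RunInstancesRequest;
    import com.amazonaws.services.ec2.model.RunInstancesResult;

    public class Provisioner {
        public static void main(String[] args) {
            // Credentials come from the default provider chain (env vars, profile, etc.).
            AmazonEC2 ec2 = new AmazonEC2Client();

            // Ask EC2 for one new virtual machine from a placeholder image.
            RunInstancesRequest request = new RunInstancesRequest()
                    .withImageId("ami-12345678")   // placeholder AMI
                    .withInstanceType("m1.small")  // placeholder size
                    .withKeyName("ops-key")        // placeholder SSH key pair
                    .withMinCount(1)
                    .withMaxCount(1);

            RunInstancesResult result = ec2.runInstances(request);
            for (Instance instance : result.getReservation().getInstances()) {
                System.out.println("Provisioned " + instance.getInstanceId()
                        + " at " + instance.getPrivateIpAddress());
            }
        }
    }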

2) Configuration
This is the step where a raw virtual machine is transformed into what it is meant to be. So, for example, a virtual machine that is supposed to be the database server would have the database server software and all the other pieces it needs installed on it. This part is probably the most complicated, or rather the most time-consuming, to set up, because it involves configuring a virtual machine to be an application server, web server, database server, cache server, load balancer, email server, router, and so on. We did not automate all of it to start with; things like the email server and the router are pretty much one-time setup activities and we did not find it worth our time. So this step falls somewhere in between “should-be-automated” and “must-be-automated”. As I explained, for one-time setup activities like the email server and router, we were ok with not automating them. But things like the web server and database server fall under the “must-be-automated” category, because we set them up and tear them down frequently, not just in production but in all the downstream environments like staging, integration and development. The other advantage is that if we were to bring up new servers (web or database) in response to a scaling or outage situation, we should be able to do it fairly quickly, and most importantly as an exact replica of what we had before.

3) Deployment
The last step in the process of infrastructure setup is application deployment, which falls under the “must-be-automated” category. Deployment means that every time we make a change to our code base, an automated process builds the code, runs the tests, and deploys it to all the different machines, like the web server, application server, database server, and so on. Having this step automated is the cornerstone of continuous delivery, which is something we highly value. Continuous delivery means being able to deploy changes to an environment quickly and with minimal manual intervention. This gives us the ability to make rapid changes to the production environment, get feedback quickly from users and adjust accordingly. Luckily for us, this step with our new vendor was going to be fully automated, else that would have been a showstopper.

The other things that we considered when moving to the new cloud vendor were:

  • How do we migrate data to the new cloud infrastructure?
  • What are the data backup solutions available with the new cloud provider?
  • Is the new cloud PCI compliant?
  • What are the SLAs (Service Level Agreements) for the new cloud? What are the escalation routes? Who will the development team have access to when an emergency arises?
  • Does the new cloud use OpenStack?
  • Does it provide services like an email server, load balancer, router, and so on, or do we have to build these ourselves?
  • Does it support encrypted backups?
  • What kind of file storage does it provide? Does it provide streaming capability for, say, video and audio?
  • Does it provide identity and access management solution?
  • What kind of monitoring solutions does the cloud vendor provide?

[1] http://cloud-computing.findthebest.com

Agile, Programming

Tasking

I have been following this seemingly innocuous practice of “tasking” when programming on a story, which I find very useful and recommend you try. Here are the what, when and why of tasking.

What: Breaking the story into small tasks that could be worked on sequentially.

When: Before writing the first line of code for the implementation of the story.

Why:
Understanding: By breaking the story into smaller chunks, you get a good grasp of the entire scope of the story. It helps define the boundary of the story and gives a good idea of what is in scope and out of scope.
Completeness: Tasking makes you think of the edge-case scenarios and is a conversation starter for missed business requirements, or even cross-functional requirements like security, performance, and so on.
Estimation: Doing the tasking right at the beginning of the story gives a sense of the story size. On my current project we are following the Kanban process. Ideally we would like to have “right-sized” stories, not too big, not too small. Tasking helps me decide if the story is too big and, if it is, how it could possibly be split into smaller ones.
Orientation: This has been a big thing for me. I feel I go much faster when I have a sense of direction. I like to know what is coming next and then just keep knocking off those items one by one.
Talking point: If you have one task per sticky, which I recommend, it serves as a good talking point for, say, business tasks vs. refactoring/tech tasks, prioritizing the tasks, and so on.
Pair switch: If you are doing pair programming, like we do, then you could be switching pairs midway through the story. Having a task list helps retain the original thought process when pairs switch. Stickies are transferable and they travel with you if you change locations.
Small commits: Another big one. Each task should roughly correspond to a commit. Each task completion should raise the question “Should we commit now?”. If you are committing sooner, even better.
Minimize distraction: There is a tendency as a developer to fix things outside the scope of the story, like refactoring or styling changes to adjacent pieces of code. If you find yourself in that situation, just create a new task for it and play it last.

Thanks for reading and feel free to share your experiences.

Design, Programming

Simplicity via Redundancy

Typically when we do a time versus space tradeoff analysis for solving a computational problem, we have to give up one for the sake of the other. By definition, you trade reduced memory usage for slower execution, or vice versa, faster execution for increased memory usage. Caching responses from a web server is a good example of space traded for time efficiency. You can store the response of a web request in a cache store, hoping that you can reuse it if the same request needs to be served again. On the other hand, if you wanted to be space efficient, you would not cache anything and would perform the necessary computations on every request.
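
As a toy illustration of that caching example (all names here are made up), the map below spends memory so that a repeated request can skip the expensive computation:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class CachingResponder {
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        public String respond(String request) {
            // Space traded for time: reuse the stored response when the same request repeats.
            return cache.computeIfAbsent(request, this::computeResponse);
        }

        private String computeResponse(String request) {
            // Stand-in for the expensive work a real web server would do per request.
            return "response for " + request;
        }
    }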

When I am having a discussion with my colleagues, this time versus space tradeoff analysis comes up quite frequently, ending in either a time-efficient or a space-efficient solution based on the problem requirements. Recently I got into a discussion which gave me a new way of thinking about the problem: how about we also include simplicity of the solution as another dimension, one which directly contributes to development efficiency? In an industrial setting, simplicity is just as important; the lack of it costs money just as time- or space-inefficient solutions do.

The recent discussion I had was about a problem that involved some serious computation at the end of a workflow that spanned multiple web requests in a web application. This computation would have been considerably simpler if we stored some extra state on a model in the middle of the workflow. It started to turn into a classic time versus space efficiency debate. Time was not an issue in this case, and hence giving up that extra space was considered unnecessary. But I was arguing for the sake of simplicity. Quite understandably, there was some resistance to this approach. The main argument was “We don’t have to store that intermediate result, so why are we storing it?”. I can understand that. If there is no need, then why do it? But if it makes the solution simpler, why not?

I admit there might be certain pitfalls in storing redundant information in the database, because it is not classically normalized data. There could also be inconsistencies in the data if one piece changes and the other piece is not updated along with it. And it might be a little weird storing random pieces of information on a domain model. Luckily, in my case, none of these applied. The data made sense on the model and could be made immutable, as it had no reason to change once set, thereby guaranteeing data consistency.

Extending this principle to code, I sometimes prefer redundancy over DRYness (Don’t Repeat Yourself) if it buys me simplicity. Some developers obsess over DRYness and in the process make the code harder to read. They will put unrelated code in some shared location just to avoid duplication. Trying to reduce the lines of code by generating similar methods on the fly using meta-programming can take it to a whole new level. It might be justified in some cases, but I am not sure how many developers think about the added complexity brought on by this approach.

A good way to think about reusability is to think about a common reason for change. If two classes have a similar-looking method and you want to extract a reusable method and share it between the two, you should also ask whether those two classes have a common reason for that method to change. If yes, then very clearly that method should be shared; if not, maybe not so much, especially if sharing makes the code harder to understand. The snippet below illustrates the idea.
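
A made-up illustration of that test: the two methods below are textually identical today, but a promotional discount and an early-settlement incentive will change for different business reasons, so I would leave the “duplication” alone rather than extract a shared method.

    import java.math.BigDecimal;

    class RetailOrder {
        // Changes when marketing changes the promotion.
        BigDecimal discountedTotal(BigDecimal total) {
            return total.multiply(new BigDecimal("0.90")); // 10% promotional discount
        }
    }

    class OverdueInvoice {
        // Changes when finance changes the settlement policy.
        BigDecimal discountedTotal(BigDecimal total) {
            return total.multiply(new BigDecimal("0.90")); // 10% early-settlement incentive
        }
    }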

I feel redundancy has value sometimes and simplicity should get precedence over eliminating redundancy. Thank you for reading. Comments welcome.
