Tech Voice: 2008

Tuesday, August 19, 2008

Free the Air Waves

As more and more TV stations move to digital-only broadcasting, they vacate the frequencies used for analog transmission. Also, the analog channels were spaced far apart to minimize interference, a decision that was made a few decades ago.

Clearly, with the current advances in technology, it is possible to use all this white space to check what your favorite blogger is doing on Twitter :-). Yep. It can be used as "wifi on steroids", as Larry put it.

As always with the govt, endless campaigns are needed to convince the FCC. So, go ahead, do yourself a favor and sign a petition.
Free The Air Waves

While you are at it, digg it.

Saturday, July 12, 2008

iPhone 3G Zagg discount

One time use 20% discount codes for Invisible Shield aka Zagg
Invisible shield for iPhone 3G

Works for iPhone 3G invisible shield too.
ch788u
n4ut8y

If you have more, please add them in the comments for others :-)

p.s: I don't work for them, I just love their products...

Thursday, May 29, 2008

Google IO 2008 Day Two

HTML 5 - Gears

From a vendor's perspective: You don't win users by being super standards compliant. At best, you don't lose users. Gears targets developers, not users. So, Gears considers standards to be a profit center rather than a cost center. Gears is a playground for new web APIs. One such example is SQL in Gears, which once proven successful in Gears, was included in HTML5.

Dalvik VM Internals:
Dalvik is designed to run on a slow CPU, with relatively little RAM, on an OS without swap space, while powered by a battery. The system library takes around 10MB. That's a fairly large library at the application developer's disposal. Memory is saved via minimal repetition, per-type pools, and implicit labeling. Zygote is a pre-initialized process with a pre-warmed Dalvik VM, which ensures responsiveness. When a new application process is requested, Zygote forks and returns the new process. The advantage is that applications share the API classes in memory, so the memory footprint is smaller. Android manages multiple processes, each with its own heap and its own GC. When an application is installed, it is verified so that it does not violate the constraints of the system. The application is also optimized (static linking, inlining some native methods, etc.) so that it runs faster. Oh, btw: Dalvik is not a stack-based machine but a register-based one. Theoretically, it is defined to be an infinite register machine. This results in fewer instructions to perform operations.
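To make the register-versus-stack point concrete, here is a toy sketch in Java. It is not Dalvik or JVM bytecode; the class name and the two hand-rolled "machines" are made up for illustration. It only counts how many instructions each style needs for a = b + c.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy illustration only: not real Dalvik or JVM instructions.
class StackVsRegisterSketch {

    // Stack style: operands travel through an operand stack.
    // Returns {result, instructionCount}.
    static int[] stackAdd(int b, int c) {
        Deque<Integer> stack = new ArrayDeque<>();
        int count = 0;
        stack.push(b);                         count++; // load b
        stack.push(c);                         count++; // load c
        stack.push(stack.pop() + stack.pop()); count++; // add
        int a = stack.pop();                   count++; // store a
        return new int[] {a, count};
    }

    // Register style: one instruction names both sources and the destination.
    // Returns {result, instructionCount}.
    static int[] registerAdd(int b, int c) {
        int[] regs = new int[3]; // r0 = a, r1 = b, r2 = c
        regs[1] = b;             // assume the operands already sit in registers
        regs[2] = c;
        int count = 0;
        regs[0] = regs[1] + regs[2];           count++; // add r0, r1, r2
        return new int[] {regs[0], count};
    }

    public static void main(String[] args) {
        System.out.println("stack instructions:    " + stackAdd(2, 3)[1]);
        System.out.println("register instructions: " + registerAdd(2, 3)[1]);
    }
}
```

Both versions compute the same sum; the register version simply dispatches fewer (but wider) instructions, which is the trade-off the talk described.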

Inside the Android Application Framework:

Lifecycle:
Android provides a lot of hooks for application developers to plug in custom code during the lifecycle. In my APIs, I try to give my developers powerful yet simple-looking APIs. Granted, it is a tricky decision to make, but it is all about drawing a line and finalizing the API. The Android lifecycle hooks are pretty complex and remind you of the EJB lifecycle. Hopefully, not many developers will get confused by the differences between methods like onCreate() and onStart(), or onStop() and onDestroy().

Threads:
Each process has a thread. Each thread has a Looper which handles a message queue. Any events are posted to this message queue and are processed by the thread through the Looper. Loopers cannot accommodate multi-threaded access directly; instead, they support a message handler through which other threads post messages. The Looper also handles local service calls. Local services are ones defined in the same process.
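The message-queue idea can be sketched in plain Java. This is not the android.os.Looper API; LooperSketch, post(), quit() and demo() are names I made up. It just shows one thread draining a queue that other threads post to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical names throughout; this is not the Android Looper API.
class LooperSketch {
    private static final Runnable QUIT = () -> {};
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    // Safe to call from any thread: the queue does the synchronization.
    void post(Runnable message) { queue.add(message); }

    void quit() { queue.add(QUIT); }

    // Runs on the owning thread and handles one message at a time.
    void loop() {
        try {
            while (true) {
                Runnable message = queue.take();
                if (message == QUIT) return;
                message.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop looping if interrupted
        }
    }

    // Posts from two threads; only the owner thread touches 'handled'.
    static List<String> demo() {
        LooperSketch looper = new LooperSketch();
        List<String> handled = new ArrayList<>();
        Thread owner = new Thread(looper::loop);
        owner.start();
        Thread worker = new Thread(() -> looper.post(() -> handled.add("from worker")));
        worker.start();
        try {
            worker.join();
            looper.post(() -> handled.add("from main"));
            looper.quit();
            owner.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The state in 'handled' needs no locking because only the looping thread ever mutates it; everyone else just posts messages.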

Processes:
Each process is given its own user ID. The only processes that run as root are Zygote, runtime, and init. A process has Intents, Services, and ContentProviders among other things. A Service is used to expose some functionality to other applications. To expose data, a ContentProvider is used.

Google IO 2008 Day One

Keynote by Vic Gundotra
3 areas that Google is concentrating on are Client, Connectivity, Cloud.

Client:
The client, from Google's perspective, is the browser. Google Gears is spearheading the effort to improve browser capabilities. The Gears team is a major contributor to HTML 5.

Alan, Vice President of Engineering at MySpace, showed an interesting use of Gears that taps the local computing power of the PC. When a user with Gears performs a search on MySpace, the results are served in real time by the Gears server on the local machine, using its power instead of making a trip to MySpace's servers. Better experience for the user, less load for MySpace. This is an interesting use, as Gears was initially designed to solve the disconnected client problem.

Connectivity:
Nice to find out that a WebKit browser is built into Android. And yes, it is embeddable into Android applications. When cell applications and the bandwidth of the mobile phone industry mature enough, the browser will play a major part in how we deploy and run applications on mobile. Nice to note that Google is working on Gears support for Android. Next was an Android demo: Browser, Maps, Notifications, Games, built-in Compass et al. Can't wait to get my hands on an Android powered phone !!!

Cloud:
Kevin Gibbs, the Tech Lead of App Engine, gave a talk about the current capabilities of the App Engine platform (check out my previous blog about App Engine) and the future direction of the project. One of the features that Google is working on is offline processing for App Engine: the ability to import and export data and to perform batch processing. Oh yes, App Engine now supports memcache. Google released pricing for App Engine and, oh yes, App Engine is public now. So go and sign up if you haven't already. But no Java/Groovy support yet. I spoke to a few Googlers about Java support for App Engine. As we all know, it is in the works. But they were pretty tight-lipped about it.

GWT: Bruce Johnson - Engineering Manager GWT
For me, being able to use GWT on mobile phones is the ultimate. Development in GWT is quite nice, but some things are not as productive as they could be. For example, having to define an interface for each component, or using a custom interface to access resource bundles. After looking at the buttoned down approach that Rails and Grails take for web development, I think GWT will really fly if it supports a scripting language (JVM-based, obviously) to develop in.

Introduction to Android - Jason Chen
Looks like there are 1.1 billion PC based internet users and over 3 billion mobile users. And, most of the mobile market is fragmented with more than a dozen platforms to develop applications on. Jason discussed the android application stack which you can find here.

Anatomy & Physiology of an Android Application
Linux Kernel: Android is built on the Linux kernel, but it is not Linux. All Android-specific functionality is built as drivers that run on top of the kernel. This is possible thanks to the kernel's modular driver model. Android has custom Linux drivers like Alarm, Ashmem (shared memory), Binder (IPC), low memory killer, power management, kernel debugger and logger modules.

Binder: Reduces IPC (Inter-Process Communication) overhead and manages security issues. Uses shared memory (data between applications is shared). Manages a per-process thread pool. Performs reference counting and mapping of object references so that shared objects are tracked and cleaned up. It also supports synchronous calls between applications to enable IPC. When process A calls method foo on a service stub, the binder driver proxies the object and sends it to the service. Since there is no serialization, the overhead of a typical IPC is avoided.

Power Management:
Mobile devices run on batteries, which have limited capacity. So Android builds on Linux's power management to manage power better; it does not replace Linux power management. Android uses wake locks (partial, full), which expose the platform's power management to the developer. When an application requests a wake lock, PowerManager makes sure that the device stays powered on until the application releases the wake lock. An alternative way of keeping the device awake is userActivity, which takes a time period and keeps the application processing part of Android alive.

Sitting on the Linux kernel are libraries written in C/C++: Bionic (the custom libc implementation), function libraries like the WebKit browser and SQLite, and native libraries like the SurfaceManager. Applications write to surfaces, and the surface manager coalesces the different surfaces and outputs the result to the frame buffer (display). The audio manager works similarly: Audio Flinger routes different audio streams to different audio devices like the speaker, headset, ear piece etc.

Hardware Abstraction Library: These are native libraries that sit above the Linux kernel. This layer defines APIs that let developers port Android to different hardware.

Android Physiology:
When the Android platform starts, it starts the Linux daemons like USB, Debug Bridge etc. It then starts the Zygote process, which is basically a template for future processes. Next, it starts the service manager as the default binder. Overall, this talk rocked.

Under the covers of App Engine Datastore - Ryan Barrett

The AppEngine datastore is powered by BigTable. BigTable is basically a sharded, sorted array. It supports a sort of test-and-set operation, which they call single row transactions (read-update).

All entities are stored in a single table called the Entities table. Entity keys have a field name:value format (hierarchical). The Entities table is ordered by key, so accessing an entity by key or iterating over adjacent keys is easy. However, the only way to re-parent an entity is to delete it and re-create it.
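Why a key-ordered table makes "adjacent key" reads cheap can be sketched with a sorted map. This is an in-memory stand-in, not the datastore API; the key strings (like "Blog:tech/Post:1") and the class and method names are my own assumptions for illustration.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// In-memory stand-in for a table ordered by key; not the App Engine API.
class SortedEntitiesSketch {

    // Hypothetical hierarchical keys: the parent path is a prefix of the child key.
    static TreeMap<String, String> sampleTable() {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("Blog:tech", "blog entity");
        table.put("Blog:tech/Post:1", "first post");
        table.put("Blog:tech/Post:2", "second post");
        table.put("Blog:travel/Post:1", "other blog's post");
        return table;
    }

    // Because rows are sorted by key, everything under a prefix is one
    // contiguous range: [prefix, prefix + '\uffff').
    static SortedMap<String, String> scanPrefix(TreeMap<String, String> table, String prefix) {
        return table.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(scanPrefix(sampleTable(), "Blog:tech/").keySet());
    }
}
```

It also hints at why re-parenting means delete and re-create: the parent is baked into the key, so a new parent means a new key and a new position in the table.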

Composite Index:
You define the index yourself in index.yaml. Or, when the app runs in the dev environment, App Engine looks at its queries and creates indices by default. The flip side of auto-indexing is that if a query has not been hit in dev mode, prod refuses to handle it and throws an exception. So, run all your code paths in dev mode.

Transactions: The AppEngine datastore supports at least read-committed isolation, using a last committed timestamp embedded into the data.

Oh yes, there was a 4 hour party at the end of the day. Good music, good food and got to network with Geeks and Higher-ups alike.

Friday, April 11, 2008

The Rise of App Engine

Google released AppEngine, and there were a bunch of reviews comparing it with Amazon EC2. Not quite the same. From the perspective of a manager who is trying to get a web project done, probably yes: you can achieve the same goal with both in this case...

But the real difference is in the level at which these are operating. AppEngine, Heroku et al are, I would say, at a service level. EC2 basically lets you write whatever the heck code you want, and provides an abstraction at the machine level. Instead of running on the bare metal, you would run on EC2 and get all the goodies that come with the classic "Add another layer of abstraction" rule.

The importance of AppEngine is the productivity gain that it would bring to a very focused but large set of applications. Yes, there is no support for batch processing, so a lot of cleanup jobs etc. cannot really be run on the platform as of now. But it is still in its initial stages and it is too early to comment. I don't see it ever brimming with additional features though. If you have noticed with Google's products, it is not the feature set that they try to win the market with. Example: Yahoo Mail and Yahoo Messenger kick GMail and GTalk's butt respectively in terms of feature set. But it is ease of use and attacking the problems that matter (like spam, being able to find what you wrote, and storage size) that Google wins with.

Also, I really don't see them ever supporting my current favorite web framework Grails. Why? Grails is tied to hibernate which I don't see running on a non relational database like the one google has as the data store of AppEngine. Not to say they are not working on a "Write your Grails, we port it to something else and run on AppEngine" virtual machine.

There have been free services that ran JSP/ASP based web applications out there for a while. But, those were more of a "write your app, run on our server" kind of model. App Engine is essentially a scalability/availability mantra for the masses. Google has enforced the scalability best practices by restricting the developers to use a subset of features which lend themselves to scaling.

On an un-related note, I could not find a way to get either blogger or feedburner to offer Tag specific feeds. I had to jump through hoops to achieve it. More on that in a later blog. Stay tuned...

Friday, March 28, 2008

TSSJS 2008 Day Three - Synopsis

Session II – eBay Market Place Architecture – Randy Shoup

What happened to Session I’s coverage? Let’s just blame it on the night before :-)

We all know eBay doesn’t do transactions. Well, at least no client side transactions. Their databases run on auto commit. One way they deal with it is by carefully ordering database operations (example: inserting the slave record and then inserting the master to ensure a consistent master). They also have reconciliation jobs that go through the database and cleanse it periodically.

Strategies for scalability used by eBay
1. Partition
2. Asynchrony
3. Automate Everything
4. Remember Everything Fails

1. Partition

Obviously, they don’t use sessions and they don’t cache business objects (surprisingly). As expected, they use URL rewriting and cookies to track the user. If the data they have to keep about the user is larger than will fit in these two schemes, they use a scratch database. Since they don’t cache business / user related data, they do hammer the databases for all their queries. To handle this situation, they use a custom sharding solution over their ORM and partition their database based on functional divides in the application.
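A rough sketch of what "functional split first, then shard" routing could look like. None of this is eBay's code; the host names, the two functional areas and the modulo rule are assumptions I picked to keep the example small.

```java
import java.util.List;
import java.util.Map;

// Hypothetical router: functional split first, then a horizontal split by id.
class ShardRouterSketch {

    // Made-up host names; each functional area owns its own set of databases.
    static final Map<String, List<String>> HOSTS = Map.of(
        "user", List.of("user-db-0", "user-db-1"),
        "item", List.of("item-db-0", "item-db-1", "item-db-2"));

    // Picks the database for a record. A simple modulo is an assumption here;
    // a real system would likely use a lookup table so shards can be moved.
    static String hostFor(String area, long id) {
        List<String> shards = HOSTS.get(area);
        if (shards == null) {
            throw new IllegalArgumentException("unknown area: " + area);
        }
        return shards.get((int) Math.floorMod(id, (long) shards.size()));
    }

    public static void main(String[] args) {
        System.out.println(hostFor("user", 7));
        System.out.println(hostFor("item", 7));
    }
}
```

The point is that the application (or a layer over the ORM) decides which database to hit before any SQL is issued, so no single database sees all the traffic.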

Search: Search queries come to an aggregator, which is actually a scatter-gather (from Enterprise Integration Patterns). This component forwards the search requests to individual nodes, each responsible for indexing and searching just a part of the entire data space. The nodes return their results to the aggregator, which aggregates (duh !!!) and displays them.
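Scatter-gather is easy to sketch with futures. Here each inner list stands in for one search node's slice of the data; the class name, the substring "search" and the alphabetical merge are simplifications of mine, not how eBay's search works.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Minimal scatter-gather sketch; each inner list plays the role of one search node.
class ScatterGatherSketch {

    static List<String> search(List<List<String>> shards, String term) {
        // Scatter: ask every shard in parallel.
        List<CompletableFuture<List<String>>> pending = new ArrayList<>();
        for (List<String> shard : shards) {
            pending.add(CompletableFuture.supplyAsync(() ->
                shard.stream()
                     .filter(item -> item.contains(term))
                     .collect(Collectors.toList())));
        }
        // Gather: wait for every partial result and merge them.
        List<String> merged = new ArrayList<>();
        for (CompletableFuture<List<String>> partial : pending) {
            merged.addAll(partial.join());
        }
        Collections.sort(merged); // stand-in for real relevance ranking
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> shards = List.of(
            List.of("red bike", "blue car"),
            List.of("red car", "green bike"));
        System.out.println(search(shards, "red"));
    }
}
```

Each node only ever looks at its own slice, so adding nodes shrinks the slice per node; the aggregator pays for that with a merge step.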

2. Asynchrony

The really hard part of messaging systems is guaranteeing exactly-once delivery. If you loosen this restriction, it is a lot easier to scale. They deal with duplicate events by making event processing idempotent. They deal with out-of-order events by having the consumer, once it receives an event, go to a service that returns the latest state.
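One common way to make a consumer idempotent is to remember which event ids it has already applied. The sketch below assumes every event carries a unique id; the class, the "bid placed" event and the in-memory set are my own illustration, not something shown in the talk.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical consumer: applying the same event twice has no extra effect.
class IdempotentConsumerSketch {
    private final Set<String> seenEventIds = new HashSet<>();
    private final Map<String, Integer> bidCounts = new HashMap<>();

    // Returns true if the event changed state, false if it was a duplicate.
    boolean onBidPlaced(String eventId, String itemId) {
        if (!seenEventIds.add(eventId)) {
            return false; // already processed: at-least-once delivery is now harmless
        }
        bidCounts.merge(itemId, 1, Integer::sum);
        return true;
    }

    int bidCount(String itemId) {
        return bidCounts.getOrDefault(itemId, 0);
    }

    public static void main(String[] args) {
        IdempotentConsumerSketch consumer = new IdempotentConsumerSketch();
        consumer.onBidPlaced("evt-1", "item-42");
        consumer.onBidPlaced("evt-1", "item-42"); // redelivered duplicate
        System.out.println(consumer.bidCount("item-42"));
    }
}
```

In a real system the "seen" set would have to live somewhere durable and be pruned over time; here it is just a HashSet to keep the idea visible.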

3. Automate Everything

Part of that is adaptive configuration: the consumers dynamically adjust to meet the SLA by changing parameters like event polling size, number of threads etc. The adaptive configuration also adapts to changes in the number of consumer instances.

He gave an example of an adaptive search experience. They have a feedback loop: offline, they analyze the feedback, create metadata out of it and feed it to the system, which uses it to change its behavior. Perturbation is the idea that 90% of the time they recommend the optimal option, and 10% of the time they recommend new options B, C, D etc., so that if D becomes popular, it will become the dominant recommended value. They also overweight negative feedback so that the oscillations are dampened. Pretty slick.
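The 90/10 perturbation rule can be sketched in a few lines. Only the split comes from the talk; the class name, the uniform pick among alternatives and the seeded Random are assumptions of mine, and the feedback weighting is left out entirely.

```java
import java.util.List;
import java.util.Random;

// Sketch of "recommend the best 90% of the time, try something else 10%".
class PerturbationSketch {

    static String recommend(String best, List<String> alternatives, Random rng) {
        // Exploit: most of the time, show the current winner.
        if (alternatives.isEmpty() || rng.nextDouble() < 0.9) {
            return best;
        }
        // Explore: otherwise pick one of the other options uniformly at random
        // (uniform choice is an assumption, the talk did not say how they pick).
        return alternatives.get(rng.nextInt(alternatives.size()));
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        List<String> alternatives = List.of("B", "C", "D");
        int bestShown = 0;
        for (int i = 0; i < 10_000; i++) {
            if (recommend("A", alternatives, rng).equals("A")) bestShown++;
        }
        System.out.println("best shown " + bestShown + " times out of 10000");
    }
}
```

The feedback loop would then count how each shown option performs and promote a new "best" when one of the explored options starts winning.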

4. Remember Everything Fails

Some of the failure patterns used are failure detection, rollback and graceful degradation. Applications log to a message bus and they have listeners that automate the failure detection. It also gives them historical data, which is used for capacity planning. They get about 1.5TB of log messages every day :-) grep that.

Code rollout/rollback: They have a policy. NO changes to the site that cannot be undone. Each feature has a rollout plan. And, there is a monster rollout plan for the 2 weeks. There is an automated tool that rolls out the dependencies in the reverse dependencies. The automated tool also does rollbacks.

Here is a cool feature. Every feature has an on and an off state. It allows them to turn features off rather than redeploying code that lacks that feature. This also allows them to deploy features turned off and then switch them on later. They are decoupling code deployment from feature deployment. From a developer's perspective, the code checks for feature availability. To blow my own horn a bit, I have built features in the past which can be turned on and off at runtime. I know what you’re thinking (don’t freak out, I don’t): this is similar to OSGi. Nope, not quite. OSGi is about deploying services and controlling their existence. I would say this may be similar from an implementation perspective, but the intent is quite different.
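A feature switch can be as small as a map lookup. The sketch below is my own minimal version, with a made-up flag name and page renderer; how eBay stores and distributes its flags was not covered.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal runtime feature switch; the flag name and page text are made up.
class FeatureToggleSketch {
    private final Map<String, Boolean> flags = new ConcurrentHashMap<>();

    void set(String feature, boolean on) { flags.put(feature, on); }

    // Unknown features are off, so new code can ship dark by default.
    boolean isOn(String feature) { return flags.getOrDefault(feature, false); }

    // Application code asks the switch instead of assuming the feature exists.
    String renderItemPage(String title) {
        if (isOn("new-bid-widget")) {
            return title + " [new bid widget]";
        }
        return title + " [classic bid form]";
    }

    public static void main(String[] args) {
        FeatureToggleSketch site = new FeatureToggleSketch();
        System.out.println(site.renderItemPage("Vintage lamp")); // deployed, still off
        site.set("new-bid-widget", true);                        // turned on later
        System.out.println(site.renderItemPage("Vintage lamp"));
        site.set("new-bid-widget", false);                       // "rollback" without a redeploy
        System.out.println(site.renderItemPage("Vintage lamp"));
    }
}
```

Turning the flag off again is the rollback, which is what makes "no change that cannot be undone" practical.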

When a resource fails and it is not critical, the failure is safely ignored. If a critical service fails, they go to an async mode (and do the processing later) or fail over. When a service does come back up, you don’t want all clients hitting it at once, so they have a phased way of letting clients hit it.

Overall, this talk was very informative. Randy went through a lot of concepts in great detail at a very high pace. I felt there was so much more information left that the talk could have gone for one more hour at the same pace.

Session III – The Busy Java Developer’s Guide to Scala – Ted Neward

A pure functional language has no side effects. But you knew that already. This talk focused on giving a Scala intro to a Java developer. Again, I am not going to cover much of this talk as you can read about Scala yourself.

Lunch Keynote Panel: Patrick Linskey, Ted Neward and others with Eugene Ciurana as the mike boy ;-)

The conversation went towards the over abundance of frameworks in java. One good point made was, you should never sit to write a framework. You build an application and then extract a framework. In that case, YAGNI will be inherent in that effort. One of the very few web frameworks that was built like that is Rails.

Another point is, it has to be usable before it is reusable. Good one.

In answer to whether, given the appearance of free-form (terse) languages where syntax does not matter much, Java will allow syntax to be optional, it was pointed out that if you try to shoehorn additional features into the language, it will collapse under its own weight. The main thing in Java is the platform and the APIs and frameworks that are available. Additional languages that run on the Java platform but support a whole new set of features will see the light.

Session IV: Map Reduce – Why does it Matter – Eugene Ciurana

We looked at Map Reduce and worked out through implementation scenarios in various business domains and the problems or hurdles that we would face in arriving at the solutions using Map Reduce. Audience got to participate and overall, it was a good exercise. (There, I can talk in bullshit business lingo too :-) )

I'm sure you've seen this already, but it is too cool not to point out
http://members.aol.com/matt999h/bullshit.htm

Thursday, March 27, 2008

TSSJS 2008 Day Two - Synopsis

Session I - Concurrency: Past and Present – Brian Goetz

I have heard Brian’s talks in the past in the No Fluff Just Stuff conferences, and they have always been enlightening.

Brian recommends these papers about concurrency:
Coping with Parallelism, Treiber, 1986
Why Threads Are a Bad Idea, Ousterhout, 1995
The Problem with Threads, Edward Lee, 2006

The talk is a bit more targeted at junior developers than I had hoped. Brian dealt with deadlocks and deadlock avoiding mechanisms. Brian talked about STM (Software Transactional Memory) and why he thinks it is not the silver bullet. He does not mention the reasons. He says he is thinking of presenting them in a talk in JavaOne.

Session II- Performance Puzzlers: Kirk Pepperdine & Brian Goetz

What followed were general commandments of performance (at least in this day and age). Release memory variables soon. Perform benchmarks on a dedicated machine.

One interesting point that was made was that in a benchmark comparing a system using an OR mapper with a similar one that didn’t, the one with the OR mapper performed better. The reason was that the concurrent garbage collector was essentially stealing processor cycles to clean up all the objects created by the OR mapper, and ended up throttling the system so that it did not flood the database with requests. When the OR mapper was missing, the database was thrashing, leading to worse performance. So, even though both the direct JDBC implementation and the ORM implementation hit the database with a similar number of requests, the ORM implementation does not flood the database and effectively throttles the requests. Interesting, but this is one of the things that happens by accident.

Session III- Implementing an ESB solution using Mule – Ross Mason

Services can be categorized as being in one of three layers.
Task based – represents a business task – example: buy a product – little reuse
Entity Based – represents a sub task – example: bill a customer – some reuse
Utility based – represents an atomic independent service – example: credit card processing – most reuse

Reusability of a service usually depends on which layer the service is classified into, with most reuse coming from the lowest (as expected).

Mule provides support for creating functional test cases which, I think are more useful than testing one single class in isolation. More about unit testing vs functional testing in a later post.

Mule Expression framework is a way to evaluate expressions on a mule message. This makes content based routing easier. The expressions can be in xpath, groovy and a whole lot of other types.

Lunch Keynote: Why the Next Five Years will be about languages – Ted Neward

Ted built a case for language oriented programming. Object oriented programming is not the pinnacle in the evolution of languages. I seriously doubt if we will ever reach the pinnacle :-). One of the required features of a language will be tool support. These days it is a lot easier to get tool support for new languages. And many of these languages can be made to run on the same platform that we currently run on. In the near future, we will see a lot more languages crop up in our mainstream development. Overall, the keynote was great.

Session IV: You got Your Ruby in my Java – Chris Nelson

Chris gave an introduction of Ruby. This talk is basically a Ruby introduction to the Java developer. I am not going to list the content of the talk as you can google it up and read about Ruby. It is nice that we as an industry are finally breaking the language stalemate that we hit and are willing to look at other languages to improve productivity.

Fire Side Chat: Concurrent Programming with Java and Erlang:

This was the first fireside chat that I’ve been to. The closest I’ve been to a fire side chat is viewing a chat on Google video :-). This free form of discussion with the right participants can really touch on various topics. Something like the discussion on TSS without the flame wars :-).

Fire Side Chat: Mission Critical deployment at Leap Frog Systems - Eugene Ciurana

I have to admit that I don't exactly remember the name of the presentation. But this is by far the best one I've attended at TSSJS. The crown jewel, if you will. To give an analogy, if you've read Patterns of Enterprise Application Architecture by Martin Fowler, it is the embodiment of years (if not decades) of rich software development expertise condensed into a book. This talk was the embodiment of years of practical, I repeat, practical software development experience condensed into a 1 hr talk. This was a fire side chat and not a typical presentation. We, the audience, got to ask all sorts of questions about Leap Frog's infrastructure and, more importantly, the design decisions that went into selecting the software solutions. It was like discussing the battle plans and reviewing the war with a war general.

No matter how many subject matter experts you talk to, it is the experience in the field and in the trenches that really teaches you and really counts.

Wednesday, March 26, 2008

TSSJS 2008 Day One - Synopsis

Key Note: Neal Ford, Thoughtworks

Neal started with an assault on Ruby on Rails developers (and later explained why Ruby on Rails developers brag). I think the reason why they brag so much is because it is so productive. Thankfully, on the Java side, we have Grails to the rescue. Anybody who hasn’t taken a look at Grails absolutely should.

Next, he builds a case for DSLs. One of the strong points of DSLs is that the context is implicit. You don’t have to keep reminding the runtime of the context:

Example: Consider the following java code

Car car = new Car();
car.setColor(Color.BLACK);
car.setTransmission(Transmission.MANUAL);
car.setPrice(30000);

Now take a look at this in a dsl that I just made up

create Car:car
color:black
transmission:manual
price:30000

Now that is much clearer, as we don’t have to keep repeating the context, which in this case is car.
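You can get part of the way there in plain Java with a fluent interface, where the context (the car) is named once and every call after that is implicitly about it. The Car class, the enums and the method names below are my own sketch, not something Neal showed.

```java
// Fluent-interface sketch: the context (the car) is stated once.
class CarDslSketch {
    enum Color { BLACK, RED }
    enum Transmission { MANUAL, AUTOMATIC }

    static final class Car {
        Color color;
        Transmission transmission;
        int price;

        static Car create() { return new Car(); }

        // Each method returns the same car, so calls can be chained.
        Car color(Color value) { this.color = value; return this; }
        Car transmission(Transmission value) { this.transmission = value; return this; }
        Car price(int value) { this.price = value; return this; }
    }

    static Car example() {
        return Car.create()
                  .color(Color.BLACK)
                  .transmission(Transmission.MANUAL)
                  .price(30000);
    }

    public static void main(String[] args) {
        Car car = example();
        System.out.println(car.color + " " + car.transmission + " " + car.price);
    }
}
```

It is still Java syntax with dots and parentheses, which is why a real DSL (or a more flexible host language) reads closer to the made-up example above.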

One thing you have to give to Neal Ford is, he knows how to give presentations. In his presentations, he normally uses just one sentence per slide, sometimes just one word. And lots of pictures. These pictures are typically not from the computer science domain. They are usually analogies from the real world that help the listener understand the underlying concepts. Another technique I noticed him use: when he wants the audience to concentrate on his talk and not on the screen, he leaves the screen blank.

Also, a template for your presentation is not needed. Leave it dark. It works magic with the audience. Templates and light colored backgrounds distract the listener from what you want to show. This style of presentation is also visible in Steve Jobs’s presentations. He never uses a template.

Ok, that was an unintended tangent. Now back to the talk: as a programmer, it adds real value to write programs that are close to how the user talks about the problem. It is a huge advantage if the users can read what the programmer produces.

The final point was that the development stack of the future would be a structured programming language at the bottom. A dynamic programming language on top of it to improve development productivity and then a DSL on top to stay close to the problem space.

Tactical Design by Glenn Vanderburg:

Presentation tip: Try to build on key aspects of prior presentations (the ones you like) so that the audience get a transitionary feel.

His talk concentrated on how to improve the design skills of poor designers and make them good designers. I think it is obvious that DRY and sticking to one level of abstraction throughout the method is a good way to improve design. This is analogous to saying “use Design Patterns”. It is also ironic that he took the same topics and repeated them over and over. Overall, the talk was sub mediocre. Luckily, I had my laptop with me and caught up on some Grails reference documentation.

p.s: Lenovo sucks. But having a built-in upgraded battery pack that gives you 6 hours of alive time rocks.

Also, if you're in a talk that you are not sure of, sit at the back so that you can sneak out if the talk sucks. (Also iterated by Hani in one of the previous years of TSSJS)

Self Scaling Java Based Cloud Architectures – Jinesh Varia, Evangelist at Amazon

I can say I hate talks given by evangelists as most of them tend to be biased. The talk sounded like a product presentation and was less about giving insight. It takes a visionary to be able to give insight. I thought I could stick around as I wanted to know more about Amazon Web Services. Basically it was a vendor talk, and it did not delve into the methodology or mindset that leads to cloud computing.

After 30 minutes, I could not bear it anymore and snuck out.

Next stop: Building REST-ful web services with the JAX-RS API – Mark Hansen

This talk turned into a ‘show and tell’ and I snuck out as soon as I typed the title.

Speed: Kirk Pepperdine

This was the third talk I went to in an hour. You can’t go wrong with a talk about the JVM :-). The talk covered improvements in garbage collection times and algorithms. One of the cool things that Java 7 (Sun JVM) will do is look for local references and allocate them in registers, so that they never see RAM.

After the talk, I caught up to Kirk and asked him a few questions about when you would do volatile (for primitives) vs when you would do AtomicBoolean(and other atomic classes). More on this on a separate blog.

Designing for scalability: Patrick Linskey

This talk was about horizontal vs vertical scalability. Running into contention because of shared state. And, about partitioning the application at various levels to achieve scalability.

An application can be partitioned in multiple ways. Some of which include:

Partitioning along application bottlenecks.

Partitioning along data set “fault lines”. Example: geographic collocation so that only relevant information is close to where it is being used. This is an example of a stateful service being partitioned for scalability.

Also using asynchronous execution to increase scalability. Nothing ground breaking. One good point made was that asynchronous architectures inherently support throttling so that the system can catch up on processing in off peak hours.
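The throttling point can be sketched with a bounded queue: requests are accepted quickly during a burst, and a worker drains them at whatever pace the backend can stand. The class name, the capacity and the "tick" notion are my own assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of async work with built-in throttling: accept fast, process at a fixed pace.
class AsyncThrottleSketch {
    private final BlockingQueue<String> backlog;

    AsyncThrottleSketch(int capacity) {
        backlog = new ArrayBlockingQueue<>(capacity);
    }

    // Producer side: returns immediately; false means the backlog is full.
    boolean submit(String job) {
        return backlog.offer(job);
    }

    // Worker side: handles at most maxPerTick jobs per tick, however big the burst was.
    int drain(int maxPerTick) {
        int done = 0;
        while (done < maxPerTick && backlog.poll() != null) {
            done++; // a real worker would do the actual processing here
        }
        return done;
    }

    int pending() {
        return backlog.size();
    }

    public static void main(String[] args) {
        AsyncThrottleSketch queue = new AsyncThrottleSketch(10);
        for (int i = 0; i < 5; i++) queue.submit("job-" + i); // peak-hour burst
        while (queue.pending() > 0) {
            System.out.println("processed " + queue.drain(2) + ", pending " + queue.pending());
        }
    }
}
```

The backend never sees more than maxPerTick jobs per tick, and whatever is left in the backlog gets worked off once the burst is over.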

Boldly Go where the java language has never gone before – Geert Bevin

This is an interesting frame of mind: you do not need to leave the Java language. The language can be instrumented or transformed to target a different execution platform so that developer skills can be reused. Geert gave the example of Terracotta, which performs dynamic byte code manipulation of classes to provide clustering without the developer having to use a vendor specific API.

The next example Geert touches is GWT. GWT lets the developer code in java but generates javascript under the covers in production. I would think the intent of GWT is nice. Free the developer from having to worry about javascript incompatibilities between different browsers. The implementation is based on java, which I increasingly think is a wrong tool to create web based/GUI applications. The Java syntax is too buttoned up. Grails on the other hand is a wonderful tool to create web application. I think a grails implementation of GWT would really shine.

The next example is Android. This is my favorite. Android, as you might already know lets the developer write code in java, but gets compiled into an executable that gets interpreted by the Dalvik virtual machine on the cell phone. Although the developer is coding in java, like in GWT, what gets executed has nothing to do with the JVM. I have not done any serious Android development. My take is that UI development is better left to templates. I think it is just a matter of time that a UI templating framework is released on top of Android.

The scalability pitfalls of the realtime web- Jonas Jacobi & John Fallows

The presenters built a case for an event-driven server, which is basically reverse Ajax, and then started listing the pitfalls and solutions (or hacks, depending on your perspective) to these pitfalls. Snoozefest. And yes, I got out as soon as I could.

There was only one other presentation that was remotely interesting, which was a use case on SCA.

Next Generation Payment Systems using SCA

My take on SCA is that it is trying to make the same mistake that EJBs did with the “thy shalt give thy java interface to the client”. I am a big proponent of simple and non language specific contracts between multiple parties in a distributed platform. When you expose a webservice (even REST based ones) you are essentially creating a contract with the end developer about the nature of the data interchange or service invocation. The problem with a language specific contract is that apart from being tied to the language, you now get into headaches with versioning. Most programming languages that we currently have do not work well with versioning. So much so, that these days, we have had to build specifications like OSGi to handle multiple versions of an object existing in the same virtual machine.
