New Java 8 features in JNBridgePro 7.2

Monday, Jun. 23rd 2014

Usually, when new versions of Java are released, we at JNBridge don’t have much to do.  The features generally don’t have an impact on what JNBridgePro does, and things just work.  Sometimes, as with variable-length argument lists (introduced in Java 5), they are simply syntactic sugar and are automatically proxied into their underlying form (an array).  Other times, new features occur behind the scenes (for example, lambda expressions in Java 8) and are never exposed to .NET users through proxies, so JNBridgePro doesn’t need to worry about them.

Java 8, however, contains two new features, static interface methods and default interface methods, that do affect JNBridgePro functionality. Both features are designed to improve usability of the language (whether or not they do is debatable, but that’s a topic for another blog post).  Static methods in interfaces are designed to add additional functionality to an interface, in the same way that static field constants have existed in Java interfaces in the past.  Default interface methods are intended to spare users from having to incorporate common implementations of certain interface methods; only unusual implementations need to be supplied.  Both static and default interface methods cause interfaces to act like abstract classes, with greater flexibility.
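
To make this concrete, here's a minimal sketch of a Java 8 interface using both features (the interface and its members are ours, purely for illustration):

public interface Greeter {
    // Static interface method (new in Java 8): belongs to the interface itself.
    // Under the mapping described below, it surfaces on a GreeterConstants helper class.
    static Greeter standard() {
        return name -> "Hello, " + name;
    }

    String greet(String name);

    // Default interface method (new in Java 8): implementing classes inherit
    // this body unless they choose to override it.
    default String greetAll(String... names) {
        StringBuilder sb = new StringBuilder();
        for (String n : names) {
            sb.append(greet(n)).append('\n');
        }
        return sb.toString();
    }
}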

For JNBridgePro, the problem is that interfaces in most .NET languages (including C#) do not have these features, and attempting to map these features directly during proxy generation will cause exceptions to be thrown, or will cause DLLs to fail verification at run time.  In order to make these new Java 8 features available to .NET developers, we had to make some changes in the way that these new members of Java interfaces are mapped to their .NET proxies.

It was easy to determine how to map static interface methods. We had already dealt with the problem of mapping static interface constants to proxied .NET interfaces, because most .NET languages don't allow those, either.  For each proxied Java interface, we automatically create an associated helper class (called IConstants, when the original interface is I), which contains the static constants. Starting with JNBridgePro 7.2, static interface methods are also proxied into those same helper classes.

Handling default interface methods is a bit more involved. Not only are C# interfaces not permitted to include actual method implementations, "default" or not, but any C# class implementing an interface must account for every method in the interface, even though a Java class implementing the underlying interface doesn't need to supply the default methods.  This means that when we proxy a Java class that implements an interface with default methods and relies on those defaults, the proxied class will fail .NET verification because it doesn't account for all methods in the implemented interface. To remedy this, starting with JNBridgePro 7.2, if a Java class relies on a default interface method and doesn't provide an implementation of its own, the .NET proxy will include an implementation of that method which, when called, invokes the underlying default method. This should be transparent to the user.

Posted by Wayne | in JNBridgePro, New releases | No Comments »

Java-in-.NET embedding and Java 7 and 8

Wednesday, Jun. 18th 2014

Embedding Java components in .NET applications doesn't work the same way under Java 7 or 8 as it did under Java 5 or 6, because the focus handling in the Java runtime has changed.

When Java components are embedded in .NET applications, and Java 7 or 8 is being used, focus-related events like keyboard events and mouse wheel events are no longer handled properly — they are no longer directed to the appropriate component, but rather are directed to the wrong place and dropped. (Other mouse events, including clicks, which are not focus-related, still function properly.)

Starting with Java 7, the Windows implementation of the AWT focus subsystem was changed.  Previously, every AWT and Swing component was mapped to an underlying Windows component (a “focus proxy”) that handled focus-related events and dispatched them to the appropriate AWT/Swing component. With Java 7 (and continuing into Java 8), the owning frame acts as the focus proxy for all components that are contained within it. Oracle claims that “the mechanism is transparent for a user,” but the change does dramatically affect the behavior of Java AWT and Swing components that are embedded inside Windows Forms and WPF applications. Our research indicates that the AWT focus subsystem is choosing the wrong Windows component as the focus proxy.

We are currently working on a fix to this problem, but we have no estimate on when that fix will be ready. In the meantime, if you are embedding Java components in .NET applications, we recommend using Java 6 for the moment. Note that if your embedded Java component does not depend on focus-related events (for example, it does not take text input or use keyboard shortcuts or respond to mouse wheel events), then you should be able to use Java 7 or 8.

Also note that embedding .NET UI components inside Java applications still works fine as before, whether Java 5, 6, 7, or 8 is being used.

We thank you for your patience while we work on this issue, and we apologize for the inconvenience.

Posted by Wayne | in Java, JNBridgePro | Comments Off

Instantiating Generic Collections

Saturday, May. 17th 2014

We had a question from a user recently asking how to instantiate certain generic collections in Java-to-.NET projects.  Some of the things that were discussed are of interest to the general community.

The user had a .NET method that was proxied to Java.  The method had several parameters, one of which was List<string> and another of which was Dictionary<string, List<string>>.  (Both List<> and Dictionary<,> are part of System.Collections.Generic.)  The questions were how to instantiate these collections and how to add elements to them.

This is actually one of the rare cases where you need to make use of the proxy for System.String. Ordinarily, .NET strings (System.String) are automatically converted to java.lang.Strings, and you never need to use the proxy for System.String, but you do here, since you need to access the .NET type object for System.String from the Java side.

To instantiate List<string>, you first need to make sure you've proxied System.String, as well as System.Collections.Generic.List__1 and System.Collections.Generic.Dictionary__2. Then use the following code:

import System.Collections.Generic.List__1;
import System.String;

List__1 theList = new List__1(String.GetDotNetClass());

Note that String above is the proxy of System.String, because of the import statement. We use that because we need the .NET typeof(string), not the Java String.class.

Note that the proxied List.Add() method has signature void Add(System.Object), so if we want to add a string using the proxied method, we need to do it as follows, using the DotNetString wrapper:

import System.DotNetString;

theList.Add(new DotNetString("a string"));

There are several ways to instantiate Dictionary<string, List<string>>, but they generally involve instantiating List<string> as above and then getting a .NET Type object from that instantiated list. Sometimes you may need to instantiate such a list just to get the Type object; other times you may have one already lying around.  Here's one way to do it, assuming you've instantiated theList as above:

import System.Collections.Generic.Dictionary__2;

Dictionary__2 theDictionary = new Dictionary__2(String.GetDotNetClass(), theList.GetType());

Then, you can add key/value pairs to the dictionary as follows:

theDictionary.Add(new DotNetString("the key"), theList);
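
Putting the pieces together, here's a complete sketch. The proxied method TheComponent.Process() is hypothetical, standing in for whatever .NET method actually takes these parameters:

import System.Collections.Generic.Dictionary__2;
import System.Collections.Generic.List__1;
import System.DotNetString;
import System.String;

public class CollectionExample {
    // java.lang.String is spelled out because the import above shadows it
    public static void main(java.lang.String[] args) {
        // List<string>: pass the .NET type object for string to the constructor
        List__1 theList = new List__1(String.GetDotNetClass());
        theList.Add(new DotNetString("a string"));

        // Dictionary<string, List<string>>: key type from GetDotNetClass(),
        // value type taken from the instantiated list
        Dictionary__2 theDictionary =
                new Dictionary__2(String.GetDotNetClass(), theList.GetType());
        theDictionary.Add(new DotNetString("the key"), theList);

        // Call the proxied .NET method (hypothetical)
        TheComponent.Process(theList, theDictionary);
    }
}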

That’s really all there is to it. If you have any questions, or want to see some other examples, please don’t hesitate to contact us.

Posted by Wayne | in JNBridgePro, Tips and examples | Comments Off

JNBridgePro v7.2, adapters v3.2 released

Tuesday, May. 13th 2014

Today we’ve announced the release of JNBridgePro version 7.2, which supports Java 8 (in addition to still supporting Java 5, 6, and 7). JNBridgePro v7.2 adds support for, among other new Java 8 features, static and default methods in interfaces. In addition, v7.2 includes substantial performance improvements in .NET applications that create very large numbers (in the millions, for example) of instances of a single Java class.

We've also announced new version 3.2 releases of our JMS adapters for BizTalk Server and .NET. In addition to using the new JNBridgePro 7.2 components, the new adapters contain enhanced diagnostics to help track down configuration and performance issues.

For more information, please see the announcement. The new releases can be downloaded here.

JNBridgePro and Java 8

Tuesday, Mar. 25th 2014

Java 8 has a couple of new features (particularly, static and default methods in interfaces) that create problems for our current JNBridgePro 7.1. We will be coming out shortly with a new version that supports Java 8 along with previous versions of Java, but if you’re currently using JNBridgePro 7.1 and are having problems with Java 8, contact us and we’ll send you a patch.

Posted by Wayne | in Java, JNBridgePro | Comments Off

Decoding encodings: XML, JMS and the BizTalk Adapter

Monday, Mar. 10th 2014

How does one view an XML document? Notepad, or the XML editor in Visual Studio? XML is text, after all, so theoretically any text editor will do the job. Right?

It’s not really WYSIWYG

Simply put, XML is a binary format. Consider this simple XML document composed in the Visual Studio XML editor:

<?xml version="1.0" encoding="utf-8"?>
<SimpleTest xmlns="http://www.jnbridge.com/simple/test/schema">
  <degreeSymbol>°</degreeSymbol>
</SimpleTest>

If this file is saved to disk, the encoding is truly UTF-8, including the UTF-8 byte order mark. However, if I compose this same document in a generic text editor and save it to disk, the encoding will most likely be Windows 1252, not UTF-8. Let's say I highlighted and copied the XML from the Visual Studio XML editor to the clipboard. When I paste the XML into another editor and save it to disk, the resulting encoding will not be UTF-8, because what was copied and pasted is Windows 1252 text. My point is that the very first line of the document, <?xml version="1.0" encoding="utf-8"?>, doesn't necessarily indicate that what you see is what you get.

UTF-8 and Windows 1252 are identical as long as every character falls within the seven-bit ASCII range. However, the degree symbol, °, is different. In Windows 1252 (and many other encodings), the degree symbol in hex is 0xB0, a single byte. In UTF-8, the degree symbol is multi-byte: 0xC2B0. If the document is composed without regard to the underlying encoding, it's very easy to end up with a UTF-8 document containing an illegal character. The problem is that just viewing the document in a text editor isn't going to tell you that. To be absolutely sure, you need a good binary editor that can convert between encodings according to their specifications.
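
A couple of lines of Java (using the standard java.nio.charset classes) make the difference visible:

import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DegreeBytes {
    public static void main(String[] args) {
        byte[] cp1252 = "°".getBytes(Charset.forName("windows-1252"));
        byte[] utf8 = "°".getBytes(StandardCharsets.UTF_8);
        System.out.printf("Windows 1252: %02X%n", cp1252[0]);      // B0
        System.out.printf("UTF-8: %02X %02X%n", utf8[0], utf8[1]); // C2 B0
    }
}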

When encoding meets code

Recently, a customer using the JNBridge JMS Adapter for BizTalk Server ran into some unexpected behavior. The JMS BizTalk adapter was replacing the File Adapter as the customer moved to a messaging solution centered on JMS. Occasionally, a UTF-8 XML document would contain an illegal character. When the File Adapter was used, the XML pipeline would throw an exception during disassembly when the illegal UTF-8 character was encountered. The failed message was routed to a directory where the illegal characters were presumably fixed and the message resubmitted. When the customer moved to the JMS adapter, there were no failed messages, even for documents containing an illegal character, in this case the one-byte degree symbol. The final XML document, wherever it was routed, now contained ' ï¿½ ' instead.  The byte 0xB0 had been replaced by three bytes: 0xEF, 0xBF and 0xBD. The customer, understandably, was confused.

The problem got its start when the message was published to the JMS queue. At some point, Java code similar to this executed. The code is Java 7.

byte[] rawBytes = Files.readAllBytes(Paths.get(somePath));
String jmsMessageBody = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(rawBytes)).toString();

A UTF-8 XML file is read as raw bytes, then explicitly decoded from UTF-8 into a java.lang.String. The string, jmsMessageBody, is used to create a JMS Text Message that will be published to a queue. Though not entirely obvious, the second line of code has just performed a conversion. A Java string uses UTF-16 encoding. During the conversion, any illegal UTF-8 sequences, like the lone byte 0xB0, are converted to the UTF-16 replacement character, 0xFFFD. This mechanism is part of the Unicode specification.

When the JMS BizTalk adapter receives the JMS Text Message, it must convert the contained text to the expected encoding, UTF-8, before submitting the message to the BizTalk MessageBox database. As per the Unicode specification, the UTF-16 replacement character, 0xFFFD, is encoded in UTF-8 as the three bytes 0xEF, 0xBF and 0xBD. When the File Adapter was used, the pipeline threw an exception because the byte 0xB0 was illegal. When the JMS adapter is used, the UTF-8 replacement bytes are perfectly legal, so the pipeline disassembles the document without complaint.
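
The entire round trip can be reproduced in a few lines of Java:

import java.nio.charset.StandardCharsets;

public class ReplacementDemo {
    public static void main(String[] args) {
        // "12" followed by a lone 0xB0 byte, which is illegal in UTF-8
        byte[] bad = { 0x31, 0x32, (byte) 0xB0 };
        // Decoding replaces the illegal byte with U+FFFD...
        String s = new String(bad, StandardCharsets.UTF_8);
        System.out.println(s.charAt(2) == '\uFFFD');  // true
        // ...and re-encoding U+FFFD yields the three bytes EF BF BD
        byte[] out = s.getBytes(StandardCharsets.UTF_8);
        System.out.printf("%02X %02X %02X%n", out[2], out[3], out[4]); // EF BF BD
    }
}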

Conclusion

The JMS specification says this about JMS Text Messages:

The inclusion of this message type is based on our presumption that String messages will be used extensively. One reason for this is that XML will likely become a popular mechanism for representing the content of JMS messages.

Of course, this was written in 1999, when XML was still used to mark up content rather than define it. Should one use Text Messages to carry XML documents as payloads? I believe the answer is 'no'. If the customer had used JMS Byte Messages instead of Text Messages, the behavior after the switch to JMS would have been the same as with the File Adapter: illegal characters would have been caught by the pipeline and subsequently fixed with the process that was already in place. As it is, the document ends up with the garbage characters ' ï¿½ ' in place of the intended degree symbol.
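
For example, publishing the document as a Byte Message keeps the original bytes intact end to end. This is only a sketch, not the customer's code; the session and producer setup are assumed:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.jms.BytesMessage;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class PublishXml {
    public static void publish(Session session, MessageProducer producer, String somePath)
            throws JMSException, IOException {
        // Send the raw UTF-8 bytes: no decode to UTF-16, no replacement characters
        BytesMessage msg = session.createBytesMessage();
        msg.writeBytes(Files.readAllBytes(Paths.get(somePath)));
        producer.send(msg);
    }
}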

Using Byte Messages for XML is also more efficient. Not only would there be no unnecessary conversions from UTF-8 to UTF-16 and back again, but there would be no message bloat: converting UTF-8 to UTF-16 can double the size of the XML document. Why incur the cost to build, publish and deliver JMS messages that are twice the size of the original document?
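
The bloat is easy to demonstrate; the UTF-16 output below includes a two-byte byte order mark:

import java.nio.charset.StandardCharsets;

public class SizeDemo {
    public static void main(String[] args) {
        String xml = "<SimpleTest><degreeSymbol>1</degreeSymbol></SimpleTest>";
        System.out.println(xml.getBytes(StandardCharsets.UTF_8).length);  // 55
        System.out.println(xml.getBytes(StandardCharsets.UTF_16).length); // 112
    }
}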

Posted by William Heinzman | in Adapters, BizTalk Server, Java | Comments Off

Forcing Visual Studio 2013 support for the WCF JMS Adapter

Thursday, Dec. 5th 2013

The JNBridge JMS Adapter for .NET depends on the Microsoft Windows Communication Foundation Line-of-Business Adapter SDK. The SDK is released with versions of BizTalk Server, not Visual Studio. This year, BizTalk 2013 was released in the spring, but Visual Studio 2013 was released in the fall, so the current version of the SDK doesn't support VS 2013. While it would be easy to tell our customers interested in integrating JMS into .NET that they must use VS 2012 or 2010, that doesn't help the customers who are already using VS 2013.

For the new release of the JNBridge JMS Adapter for .NET, the solution was for us to explicitly install and configure the WCF LOB SDK for VS 2013, as part of the adapter installation. This consists of four assemblies and registry entries into the Visual Studio registry hive, HKLM\SOFTWARE\Microsoft\VisualStudio\12. Along with the work inherent in getting the SDK to support VS 2013, JNBridge must also explicitly support the SDK for our end-users.

Couldn’t we just wait until the SDK was refreshed by Microsoft? We could, but the point is that by installing and supporting the SDK now, the customer, JNBridge and Microsoft all benefit.

Posted by William Heinzman | in Adapters, New releases, WCF JMS Adapter | Comments Off

Announcing JNBridgePro 7.1 and versions 3.1 of the JMS Adapters for .NET and BizTalk Server

Thursday, Dec. 5th 2013

JNBridgePro version 7.1 and versions 3.1 of the JMS Adapters for .NET and for BizTalk Server are released!

JNBridgePro now supports Visual Studio 2013, and completes the "any-CPU" feature, allowing separate 32-bit and 64-bit JVMs to be specified in a single shared-memory application.

The JMS Adapters for .NET and for BizTalk now support the JMS 2.0 specification and include a new unified installer for both 32-bit and 64-bit.

Download a full-featured 30-day trial today from www.jnbridge.com/downloads.htm.

Posted by JNBridge | in Adapters, Announcements, JNBridgePro, New releases | Comments Off

Creating a .NET-based Visual Monitoring System for Hadoop

Monday, Oct. 21st 2013

Summary

Generic Hadoop doesn’t provide any out-of-the-box visual monitoring systems that report on the status of all the nodes in a Hadoop cluster. This JNBridge Lab demonstrates how to create a .NET-based monitoring application that utilizes an existing Microsoft Windows product to provide a snapshot of the entire Hadoop cluster in real time.

Download the source code for this lab here, get a PDF of this Lab here. Click the image below to see a .GIF preview of the app in action.

[Animated GIF: the visualizer app in action]

Introduction

One of the ideals of distributed computing is to have a cluster of machines that is utterly self-monitoring, self-healing, and self-sustaining. If something goes wrong, the system reports it, attempts to repair the problem, and, if nothing works, reports the problem to the administrator — all without causing any tasks to fail. Distributed systems like Hadoop are so popular in part because they approach these ideals to an extent that few systems have before. Hadoop in particular is expandable, redundant (albeit not in a terribly fine-grained manner), easy-to-use, and reliable. That said, it isn’t perfect, and any Hadoop administrator knows the importance of additional monitoring to maintaining a reliable cluster.

Part of that monitoring comes from Hadoop’s built-in webservers. These have direct access to the internals of the cluster and can tell you what jobs are running, what files are in the distributed system, and various other bits of important information, albeit in a somewhat obtuse spreadsheet format. A number of Hadoop packages also come with generic distributed system monitoring apps such as Ganglia, but these aren’t integrated with Hadoop itself. Finally, there are products like Apache Ambari from Hortonworks that do a great deal of visual monitoring, but are tied to particular companies’ versions of Hadoop. In this lab we will look at the basics of producing a custom app that is integrated into the fabric of your own Hadoop cluster. In particular, being JNBridge, we are interested in building a .NET-based monitoring app that interfaces over a TCP connection with Java-based Hadoop using JNBridgePro. To expedite the process of creating a GUI for our monitoring app, we will use Microsoft Visio to easily create a visual model of the Hadoop cluster. This way we can create a rudimentary monitoring app that works as a cluster visualizer as well.

The app that we’re aiming to create for this lab is fairly simple. It will present a graph-like view of the logical topology of the cluster where each worker node displays its status (OK or not OK), the amount of local HDFS space used up, and the portion of  Mappers and Reducers that are in use. We’re not looking for hard numbers here — that information is attainable through the webservers — rather, our goal is to create a schematic that can be used to quickly determine the status of various components of the cluster.

Before we begin, please bear in mind two things: 1. We are not proposing our solution or our code as an actual option for monitoring your Hadoop cluster. We are simply proposing certain tools that can be used in the production of your own monitoring app. 2. Our solution was produced for Hortonworks’ HDP 1.3 distribution of Hadoop 1.2.

Even in our limited testing we noticed a distinct lack of portability between different Hadoop distributions and versions — particularly where directory locations and shell-script specifications are concerned. Hopefully our explanations are clear enough that you can adjust to the needs of your own cluster, but that might not always be the case. We’re also going to assume a passing familiarity with Hadoop and Visio, since explaining either system and its internal logic in great detail would make this lab much longer than need be.

What You’ll Need

  1. Apache Hadoop (we used the Hortonworks distribution, though any will work with some effort)
  2. Visual Studio 2012
  3. Microsoft Visio 2013
  4. Visio 2013 SDK
  5. JNBridgePro 7.0

Digging into Hadoop

To begin, in order to get as complete information about the cluster as possible, we need to get hold of the NameNode and JobTracker objects — which manage the HDFS and MapReduce portions of Hadoop respectively — that are currently running on the cluster. This exposes the rich APIs of both the JobTracker and the NameNode, as well as those of the individual nodes of the cluster. These are the APIs that the JSP code uses to create the built-in webserver pages, and they provide more than enough information for our purposes.

However, accessing these objects directly is somewhat difficult. By and large, Hadoop is built so the end user can only interface with the cluster via particular sockets that only meter certain information about the cluster out and only allow certain information in. Thus getting direct access to and using the APIs of the running NameNode and JobTracker isn’t something that you’re supposed to be able to do. This is a sensible safety precaution, but it makes getting the kind of information required for a monitoring app somewhat complicated. Granted, there is the org.apache.hadoop.mapred.ClusterStatus class that passes status information over the network, but the information it provides isn’t enough to create a truly robust monitoring app. Our solution to this dilemma involves a lightweight hack of Hadoop itself. Don’t worry, you’re not going to need to recompile source code, but some knowledge of that source code and the shell scripts used to run it would be helpful.

Our goal is to wedge ourselves between the scripts that run Hadoop and the process of actually instancing the NameNode and JobTracker. In so doing, we can write a program that breaks through the walled garden and allows us to serve up those objects to the .NET side directly. Technically a similar process could be used to code a similar monitoring app in pure Java, but that’s not what we’re interested in here. If things still seem a little fuzzy, hopefully you’ll get a better idea of our solution as we explain it.

When the $HADOOP_INSTALL/hadoop/bin/hadoop script is called to start the NameNode and JobTracker, it simply runs NameNode.main() and JobTracker.main(). These main functions, in turn, call just a handful of lines of code to start the two master nodes. Note that this process is usually further obfuscated by a startup script such as start-all.sh or, in our case with Hortonworks, hadoop-daemon.sh, but they all ultimately call the same $HADOOP_INSTALL/hadoop/bin/hadoop script. In our solution, instead of having the script call NameNode.main() and JobTracker.main(), we instead call the main functions of our own wrapper classes that contain the code from the original main functions in addition to setting up the remote Java-side servers of JNBridgePro. These wrapper classes are then able to serve up the JobTracker and NameNode instances to our Windows-based monitoring app.

The JobTracker wrapper class looks like this:

import java.util.Properties;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobTracker;
import com.jnbridge.jnbcore.server.ServerException;

public class JnbpJobTrackerWrapper {

	private static JobTracker theJobTracker = null;

	public static void main(String[] args) {

		// Start the JNBridge Java-side server (TCP, port 8085) so the
		// .NET monitoring app can reach into this JVM.
		Properties props = new Properties();
		props.put("javaSide.serverType", "tcp");
		props.put("javaSide.port", "8085");
		try {
			com.jnbridge.jnbcore.JNBMain.start(props);
		} catch (ServerException e) {
			e.printStackTrace();
		}

		// Start the JobTracker exactly as the original JobTracker.main() does.
		try {
			theJobTracker = JobTracker.startTracker(new JobConf());
			theJobTracker.offerService();
		} catch (Throwable e) {
			e.printStackTrace();
		}
	}

	// Called through the proxies to hand the live JobTracker to the .NET side.
	public static JobTracker getJobTracker()
	{
		return theJobTracker;
	}
}

And the NameNode wrapper class looks like this:

import java.util.Properties;
import org.apache.hadoop.hdfs.server.namenode.NameNode;
import com.jnbridge.jnbcore.server.ServerException;

public class JnbpNameNodeWrapper {

	private static NameNode theNameNode = null;

	public static void main(String[] args) {

		// Start a second JNBridge Java-side server on its own port (8087).
		Properties props = new Properties();
		props.put("javaSide.serverType", "tcp");
		props.put("javaSide.port", "8087");
		try {
			com.jnbridge.jnbcore.JNBMain.start(props);
		} catch (ServerException e) {
			e.printStackTrace();
		}

		// Start the NameNode exactly as the original NameNode.main() does.
		try {
			theNameNode = NameNode.createNameNode(args, null);
			if (theNameNode != null)
			{
				theNameNode.join();
			}
		} catch (Throwable e) {
			e.printStackTrace();
		}
	}

	// Called through the proxies to hand the live NameNode to the .NET side.
	public static NameNode getNameNode()
	{
		return theNameNode;
	}
}

To have the $HADOOP_INSTALL/hadoop/bin/hadoop script call our classes instead, we alter the following lines of code:

elif [ "$COMMAND" = "jobtracker" ] ; then
  #CLASS=org.apache.hadoop.mapred.JobTracker
  CLASS=com.jnbridge.labs.visio.JnbpJobTrackerWrapper
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_JOBTRACKER_OPTS"

and

elif [ "$COMMAND" = "namenode" ] ; then
#CLASS=org.apache.hadoop.hdfs.server.namenode.NameNode
CLASS=com.jnbridge.labs.visio.JnbpNameNodeWrapper
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"

The replacement lines are right below the commented-out original lines.

In order to finish up the Java side of this solution, we need to add our wrapper classes as well as the JNBridge .jars to the Hadoop classpath. In our case that meant simply adding the wrapper class .jars along with jnbcore.jar and bcel-5.1-jnbridge.jar to the $HADOOP_INSTALL/hadoop/lib directory. Since the Hadoop startup scripts automatically include that directory as part of the Java classpath, we don’t need to do anything else. The startup scripts that came with the Hortonworks distribution work exactly as they did before, and the cluster starts up without a hitch. Only now, our master nodes are listening for method calls from the .NET side of our monitoring app.

Monitoring on the Windows Side

To begin building the Java proxies, we need to add the .jars that contain the appropriate classes to the JNBridge classpath. Below is a screenshot from the JNBridge Visual Studio plugin of the .jars we added to the classpath used to create the proxies. Note that this includes not just the requisite JNBridge .jars and the Hadoop .jars (scraped from the machines in our cluster), but also our wrapper class .jars as well (here called proto1.jar).

[Screenshot: the proxy-generation classpath in the JNBridge Visual Studio plugin]

From here we need to actually pick those classes that need to be proxied so that they can be called within our C# code natively. The easiest way to do this is simply select the two wrapper classes (JnbpJobTrackerWrapper and JnbpNameNodeWrapper) in the left window of the JNBridge Visual Studio proxy tool and click the Add+ button. JNBridge will take care of the rest automatically.

[Screenshot: selecting the wrapper classes in the proxy tool]

Now we can build the monitoring app itself. When beginning your new project, make sure to add the correct references. You need to reference the correct JNBShare.dll for your version of .NET, the .dll created by the JNBridge proxy process you performed earlier, and the Microsoft.Office.Interop.Visio.dll for your version of Microsoft Visio. We used Visio 2013 for this project along with its SDK (which is a newer version than what came with Visual Studio 2012). Also, be sure to add the JNBridge license .dll to your references. Here's what our references looked like (note that HadoopVis is the name we gave to our Java proxy .dll):

[Screenshot: the project references, including the HadoopVis proxy .dll]

The overall flow of our code is fairly simple: get the JobTracker and NameNode objects, use them to build an internal representation of the cluster, draw that cluster in Visio, and update that drawing with the latest information about the cluster.

In order to get the JobTracker and NameNode objects, we need to connect to the wrapper objects running on our cluster. We do this as follows:

// Connect to the two Java sides
// The jobTracker side is automatically named "default"
JNBRemotingConfiguration.specifyRemotingConfiguration(JavaScheme.binary,
        "obiwan.local", 8085);
// We name the nameNode side "NameNode"
JavaSides.addJavaServer("NameNode", JavaScheme.binary, "obiwan.local", 8087);

Note that we are connecting to two Java sides here even though they are both located on the same physical machine (obiwan.local on our cluster). In JNBridgePro, the system needs to know which Java side to communicate with in order to function properly. If you have an object that exists on one remote JVM and you try to call one of its methods while your code is pointing at a different JVM, your program will crash. To manage this, use JavaSides.setJavaServer() to point to the correct JVM. You'll see this sprinkled throughout our code as we switch between pointing to the JobTracker and the NameNode JVMs.

Once we’ve connected to the Java side, we just need to get the objects and build our internal representation of the cluster. The overall program flow looks like this:

JobTracker jobtracker = JnbpJobTrackerWrapper.getJobTracker();
java.util.Collection temp = jobtracker.activeTaskTrackers();
java.lang.Object[] tts = temp.toArray();
JavaSides.setJavaServer("NameNode");
NameNode namenode = JnbpNameNodeWrapper.getNameNode();
HadoopTree.buildTree(jobtracker, "obiwan.local", "obiwan.local", tts,
        namenode.getDatanodeReport(FSConstants.DatanodeReportType.ALL));
// closedFlag is True if a user closes the Visio window in use.
while (!closedFlag)
{
    JavaSides.setJavaServer("NameNode");
    DatanodeInfo[] dnReport = namenode.getDatanodeReport(
            FSConstants.DatanodeReportType.ALL);
    JavaSides.setJavaServer("default");
    HadoopTree.updateTree(jobtracker.activeTaskTrackers().toArray(),
            jobtracker.blacklistedTaskTrackers().toArray(), dnReport);
    System.Threading.Thread.Sleep(3000);
}

The buildTree() and updateTree() methods build and update the internal representation of the cluster and hold it within the HadoopTree class. These methods also invoke the VisioDrawer class that, in turn, draws that internal representation to Visio. We're not going to go into detail here about how the HadoopTree class builds our internal representation of the cluster. The simple tree-building algorithm we use isn't terribly pertinent to our current discussion, but we encourage you to look at our code, especially if you're curious about what methods we use to extract information from the JobTracker and NameNode objects (though a few of those methods can be seen in the above code snippet). Keep in mind there are a number of ways to pull information about the cluster from these two objects, and we encourage you to explore the published APIs to figure out how to get the information you want for your app. On a side note, the API for the NameNode isn't currently published as part of the official Hadoop API, so you'll have to go back to the source code to figure out what methods to call. The API for the NameNode is considerably different from that for the JobTracker, too, so don't expect similar functionality between the two.

Drawing Everything in Visio

Once we have an internal representation of the cluster, we need to draw it in Visio to complete our rudimentary monitoring/visualization app. We begin by opening a new instance of Visio, creating a new Document, and adding a new Page:

// Start application and open new document
VisioApp = new Application();
VisioApp.BeforeDocumentClose += new EApplication_BeforeDocumentCloseEventHandler(quit);
ActiveWindow = VisioApp.ActiveWindow;
ActiveDoc = VisioApp.Documents.Add("");
ActivePage = ActiveDoc.Pages.Add();
ActivePage.AutoSize = true;

We then open our custom network template (which we’ve included with the code for your use) and pull out all the Masters we need to draw our diagram of the Hadoop cluster:

// Open visual templates
networkTemplate = VisioApp.Documents.OpenEx(@"$Template_Directory\NetTemplate.vstx",
        (short)VisOpenSaveArgs.visOpenHidden);
Document pcStencil = VisioApp.Documents["COMPME_M.VSSX"];
Document networkStencil = VisioApp.Documents["PERIME_M.VSSX"];
Shape PageInfo = ActivePage.PageSheet;
PageInfo.get_CellsSRC((short)VisSectionIndices.visSectionObject,
        (short)VisRowIndices.visRowPageLayout,
        (short)(VisCellIndices.visPLOPlaceStyle)).set_Result(VisUnitCodes.visPageUnits,
                (double)VisCellVals.visPLOPlaceTopToBottom);
PageInfo.get_CellsSRC((short)VisSectionIndices.visSectionObject,
        (short)VisRowIndices.visRowPageLayout,
        (short)(VisCellIndices.visPLORouteStyle)).set_Result(VisUnitCodes.visPageUnits,
                (double)VisCellVals.visLORouteFlowchartNS);
ActivePage.SetTheme("Whisp");

// Get all the master shapes
masterNode = pcStencil.Masters.get_ItemU("PC");
slaveNode = networkStencil.Masters.get_ItemU("Server");
rack = networkStencil.Masters.get_ItemU("Switch");
dynamicConnector = networkStencil.Masters.get_ItemU("Dynamic connector");

// Open data visualization template and shape
slaveBase = VisioApp.Documents.OpenEx(@"$Template_Directory\DataGraphicTemplate.vsdx",
        (short)Microsoft.Office.Interop.Visio.VisOpenSaveArgs.visOpenHidden);
slaveDataMaster = slaveBase.Pages[1].Shapes[1].DataGraphic;

There are two important things in this snippet that don’t crop up in a lot of examples using Visio. First are these two statements:

PageInfo.get_CellsSRC((short)VisSectionIndices.visSectionObject,
        (short)VisRowIndices.visRowPageLayout,
        (short)(VisCellIndices.visPLOPlaceStyle)).set_Result(VisUnitCodes.visPageUnits,
                (double)VisCellVals.visPLOPlaceTopToBottom);
PageInfo.get_CellsSRC((short)VisSectionIndices.visSectionObject,
        (short)VisRowIndices.visRowPageLayout,
        (short)(VisCellIndices.visPLORouteStyle)).set_Result(VisUnitCodes.visPageUnits,
                (double)VisCellVals.visLORouteFlowchartNS);

These statements tell Visio how to lay out the diagram as it’s being drawn. The first statement tells Visio to create the drawing from top-to-bottom (with the master nodes on top and the slave nodes on the bottom) while the second tells Visio to arrange everything in a flowchart-style pattern (we found this to be the most logical view of the cluster). Logically what we’re doing is editing two values in the current Page’s Shapesheet that Visio refers to when making layout decisions for that Page.

The second thing we want to draw your attention to is this pair of lines:

slaveBase = VisioApp.Documents.OpenEx(@"$Template_Directory\DataGraphicTemplate.vsdx",
        (short)Microsoft.Office.Interop.Visio.VisOpenSaveArgs.visOpenHidden);
slaveDataMaster = slaveBase.Pages[1].Shapes[1].DataGraphic;

This code opens a prior Visio project (that we’ve also included with our code) where we’ve simply tied a series of DataGraphics to a single Shape. These DataGraphics can then be scraped from the old project and tied to Shapes in our new project. Our prefabricated DataGraphics are used to display information about individual nodes in the cluster including HDFS space, Mappers/Reducers in use, and overall status of the TaskTracker and DataNode. We have to create these DataGraphics ahead of time since they can’t be created programmatically.

We can then draw the cluster on the Page that we’ve created. Again, we are going to skip over this portion of the process since it is largely standard Visio code. The cluster representation is drawn mostly using the Page.DropConnected() method, and since we’ve already told Visio how to format the drawing, we don’t need to mess with its layout too much. All we have to do is call Page.Layout() once all the Shapes have been drawn to make sure everything is aligned correctly.

The last interesting bit we want to touch on is updating the drawing with the most recent data from the cluster. First we need to get the latest data from the cluster and update our internal representation of the cluster:

public static void updateTree(Object[] taskNodeInfo, Object[] deadTaskNodes,
        DatanodeInfo[] dataNodeInfo)
{
    JavaSides.setJavaServer("NameNode");
    foreach (DatanodeInfo dn in dataNodeInfo)
    {
        HadoopNode curNode;
        leaves.TryGetValue(dn.getHostName(), out curNode);
        if (dn.isDecommissioned())
        {
            curNode.dataActive = false;
        }
        else
        {
            curNode.setHDSpace(dn.getRemainingPercent());
            curNode.dataActive = true;
        }
    }
    JavaSides.setJavaServer("default");
    foreach (TaskTrackerStatus tt in taskNodeInfo)
    {
        HadoopNode curNode;
        leaves.TryGetValue(tt.getHost(), out curNode);
        curNode.setMapUse(tt.getMaxMapSlots(), tt.countOccupiedMapSlots());
        curNode.setReduceUse(tt.getMaxReduceSlots(), tt.countOccupiedReduceSlots());
        curNode.taskActive = true;
    }
    foreach (TaskTrackerStatus tt in deadTaskNodes)
    {
        HadoopNode curNode;
        leaves.TryGetValue(tt.getHost(), out curNode);
        curNode.taskActive = false;
    }
    VisioDrawer.updateData(leaves);
}

Once the data has been gathered, the Visio drawing is updated:

public static void updateData(Dictionary<string, HadoopNode> leaves)
{
    foreach (KeyValuePair<string, HadoopNode> l in leaves)
    {
        HadoopNode leaf = l.Value;
        Shape leafShape = leaf.getShape();
        // Update HDFS information
        if (leaf.dataActive)
        {
            leafShape.get_CellsSRC(243, 20, 0).set_Result(0, leaf.getHDSpace());
            leafShape.get_CellsSRC(243, 0, 0).set_Result(0, 1);
        }
        // If the DataNode has failed, turn the bottom checkmark to a red X
        else
        {
            leafShape.get_CellsSRC(243, 0, 0).set_Result(0, 0);
        }
        // Update mapred information
        if (leaf.taskActive)
        {
            leafShape.get_CellsSRC(243, 17, 0).set_Result(0, leaf.getMapUse());
            leafShape.get_CellsSRC(243, 18, 0).set_Result(0, leaf.getReduceUse());
            leafShape.get_CellsSRC(243, 12, 0).set_Result(0, 1);
        }
        // If the Tasktracker has failed, turn the bottom checkmark to a red X
        else
        {
            leafShape.get_CellsSRC(243, 12, 0).set_Result(0, 0);
        }
    }
}

Logically, we are just changing certain values in each Shape's Shapesheet that are tied to the DataGraphics we added earlier. Which cells of the Shapesheet correspond to which DataGraphic had to be decided in advance, when we created the DataGraphics by hand. This way we can address those indices directly in our code.

This updating process (as you saw in an earlier code segment) is done with a simple polling loop that updates every three seconds. We used this method rather than a callback/event-handling strategy largely for ease of implementation. The NameNode and JobTracker classes don't implement a listener interface for notifying when values change; as a result, adding that functionality would require significantly more Hadoop hacking than we've already done. We could also implement an asynchronous update system in pure C# that would use events to notify the graphic to update, but that would still require polling the Java side for changes somewhere within our program flow. Such a system would lighten the load on Visio by decreasing the number of times we draw to the Page, but wouldn't increase efficiency overall. While both ways of implementing callbacks are interesting exercises, they're somewhat outside the scope of this lab.

The Result

For our small, four-virtual-machine cluster, this is the result (as you saw above):

[Image: the visualizer displaying our four-node cluster]

Here a Map/Reduce job is running such that 100% of the Mappers are in use and none of the Reducers are being used yet. Also, notice that the middle worker node has used up almost all of its local HDFS space. That should probably be addressed.

For larger, enterprise-size clusters Visio will likely become an even less viable option for handling the visualization, but for our proof-of-concept purposes it works just fine. For larger clusters, building a visualizer with WPF would probably be the better answer for a .NET-based solution.

We hope this lab has been a springboard for your own ideas related to creating Hadoop monitoring/visualization applications.

~Ian Heinzman

Posted by ianheinzman | in Labs | 1 Comment »

Build 2013 Impressions

Wednesday, Jul. 24th 2013

I recently came back from Microsoft's Build 2013 conference in San Francisco, where Microsoft's latest technologies are introduced to developers.  Much of the conference was devoted to technology related to Metro/Modern Windows/Windows Store apps, and also to Windows Phone, neither of which is relevant to JNBridge's mission.  However, there were a few things that caught my eye, and which are worth mentioning.

First, Windows 8.1 (formerly, Windows Blue) was unveiled.  It’s definitely an improvement for desktop users.  We currently run Windows 7 on our development machines and do our Windows 8 tests on virtual machines.  While nothing I saw at Build regarding Windows 8.1 will change that, it’s definitely getting closer to being a system that I would feel comfortable using every day.

Visual Studio 2013 was introduced during the Build keynote, and I was quite impressed with the new code navigation features, including the enhanced scrollbars and the look-ahead capability that allows developers looking at a method call to open a mini-window containing the text of the called method.  I can certainly see using these features in our future development.

It appears that the JNBridgePro Visual Studio plugin should work just fine with VS 2013, although we realize that this is a very early version of the software, and things could change.  We will certainly be tracking that.  Similarly, the new version of the .NET Framework that will be released with VS 2013 seems to run JNBridgePro just fine.

Finally, given our interest in interoperability issues relating to Hadoop, we were intrigued to see this talk on Azure HDInsight.  We’ve been thinking of new ways in which JNBridgePro can be integrated with Big Data applications, and the talk suggested some scenarios that we’ll be working on in the coming months.

Were you at Build, and, if so, did you see anything interesting related to interoperability?  Let us know.

Posted by Wayne | in Events, General | Comments Off