Monday, May 16, 2011

Sentiment Analysis Processing For .NET Solutions

Sentiment Analysis (SA) is the process of determining 'tone' within the context of a given text. Today's world of CRM systems, blogging, and social networking has sparked an interest in being able to harness the overall sentiment of that content (positive, negative, neutral, etc.). The algorithmic computations and processing required to perform sentiment analysis are based on the field of Natural Language Processing, and either field can make someone's career all on its own. Needless to say, it is probably not something a .NET developer wants to attempt from scratch.

The trend tends to be publicly available RESTful APIs, either free or paid, that allow a developer to send data, have SA processing performed against the submitted data, and have the resulting sentiment information returned.
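To make the round trip described above concrete, here is a minimal sketch of posting text to a REST sentiment service from VB.NET. The endpoint URL and the "apikey" and "text" field names are hypothetical placeholders, not taken from any specific provider; each real API documents its own URL, parameters, and response format.

```vbnet
Imports System.Collections.Specialized
Imports System.Net
Imports System.Text

Module SentimentClientSketch
    ' Hypothetical endpoint; replace with the URL from your provider's documentation.
    Private Const ApiUrl As String = "https://api.example.com/v1/sentiment"

    ' Sends the text to the service and returns the raw response body (typically JSON or XML).
    Function GetSentiment(ByVal text As String, ByVal apiKey As String) As String
        Using client As New WebClient()
            ' Hypothetical form field names; real services define their own.
            Dim form As New NameValueCollection()
            form.Add("apikey", apiKey)
            form.Add("text", text)

            ' UploadValues issues an HTTP POST with form-encoded data.
            Dim responseBytes As Byte() = client.UploadValues(ApiUrl, "POST", form)
            Return Encoding.UTF8.GetString(responseBytes)
        End Using
    End Function
End Module
```

The response would still need to be parsed according to whatever format the chosen service returns.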

There are typically (2) main approaches to processing the data to determine sentiment: trained systems using custom data sets (not ADO.NET datasets, actual text data), and out-of-the-box non-trained systems using a default data set. The latter is the simplest approach because you can be up and processing data in a matter of minutes. The downside is much less accurate results (my tests averaged about 60-70% accuracy across the board using non-trained systems). A trained system using custom data sets with terminology and keywords specific to your need or industry would be the best approach and yield the most accurate results. The downside is that it is much more time consuming and involved, and will most likely be associated only with paid solutions.

When 1st researching sentiment analysis in relation to .NET applications, I tell you truthfully I knew nothing about it (or what sentiment analysis was even termed) and was looking for an out-of-the-box widget from CodePlex or something in the form of a .dll, etc. Wrong! But I did learn about the different APIs available and composed a list. Below are the names and links of several Sentiment Analysis APIs that could be used for .NET applications, and really for other programming languages too because most are RESTful APIs. (I apologize for the URL data below being an image, but formatting it all or doing it manually was going to be a nightmare)


Table 1.0: Sentiment Analysis

As I mentioned before, this isn't such a straightforward need that I can tell you, "Pick this 1 best sentiment analysis API." You really need to try them out independently as I did and become familiar with each tool and its capabilities. A lot of these tools can get very expensive depending on load. If you are just randomly processing a few blog articles or some basic customer feedback, you can probably use any of the solutions for free or close to it. However, the minute you want to process the data stored in a CRM system or from a social networking site like Facebook or Twitter, you could be looking at a multi-thousand dollar ongoing investment. For this reason you should research, become familiar with, and test each API listed to see how it performs.

Lastly I am no expert on this subject and the purpose of this article was really just to create a more organized starting point which I did not have. The following (2) links below will give a decent high level description of both Sentiment Analysis and Natural Language Processing. I also welcome any comments on additional APIs others may have used and the details about them.

Sentiment analysis

Natural language processing

Saturday, April 30, 2011

What is Decompiling an Assembly in .NET and Some Good Tools To Do It

From time to time I like to mix some new information with some basic concepts. I am not always going to compete with the bleeding edge blogs that have full drill down articles 5 minutes after WebMatrix was announced. And what I always remember is not every developer out there is pushing the envelope or is seasoned in this field yet. So I like to cover some basic concepts occasionally to hopefully add a fresh perspective on something more junior devs out there may not understand.

Decompiling. What is it in reference to .NET assemblies? Well, in the most abstract form of description, it is the process of reverse engineering MSIL (Microsoft Intermediate Language) back into a more familiar, readable form close to the code in which it was developed. But wait (some might say)! I thought once I compiled my .NET project, the output library was cryptic and unreadable low-level machine code just like my old VB6 or C++ .dlls. Not in our case when doing .NET managed code development. When you compile a .NET assembly, it is compiled into the aforementioned intermediate language named MSIL. The MSIL is then compiled to native machine code at run time by a Just-In-Time (JIT) compiler; in our case the .NET Framework JIT compiler is used.

Some of you might remember (or maybe not if you are newer to the industry which is OK), all the buzz around all the different *.NET* languages when the Framework 1.0 was 1st released in late 2001. All this talk about VB.NET, C#, COBOL.NET, Fortran.NET, etc. and everyone scratching their heads going, huh? Well this was because the equation was simpler now:

[Language X] + [.NET Compiler for Language X] = MSIL

So with all of these languages, the goal is to have them compiled to this intermediate language named MSIL. Since MSIL is meant to be consumed by a JIT compiler and is not unreadable in itself, it also presents the opportunity to go the opposite direction and turn the MSIL back into readable code. Hence a 'Decompiler'.
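As a small illustration that the MSIL really does travel with the compiled assembly, the sketch below uses reflection to read the raw IL bytes of a method at run time. The module and method names (IlPeek, Add) are made up for the example; the reflection calls are the standard System.Reflection ones. A decompiler does far more than this, but it starts from the same bytes.

```vbnet
Imports System
Imports System.Reflection

Module IlPeek
    ' A trivial method whose compiled IL we will inspect.
    Function Add(ByVal a As Integer, ByVal b As Integer) As Integer
        Return a + b
    End Function

    Sub Main()
        ' Look up the method's metadata, then ask for its IL body as raw bytes.
        Dim method As MethodInfo = GetType(IlPeek).GetMethod("Add")
        Dim il As Byte() = method.GetMethodBody().GetILAsByteArray()

        ' The exact bytes vary between Debug and Release builds.
        Console.WriteLine("Add compiles to {0} bytes of IL: {1}", il.Length, BitConverter.ToString(il))
    End Sub
End Module
```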

With a decompiler tool, you can open a compiled .NET assembly (i.e. a .dll from the /bin directory) and reverse engineer the MSIL back into more readable code close to its original uncompiled version (some tools are much better at this than others). Why is this useful to me (you might ask)? You will not use it all the time, but imagine if you got a 3rd party .dll without the source code and wanted to see how it worked? Use a decompiler. What if a rogue developer jumped ship and took all the .NET source code with him, only leaving the installed assemblies behind? Use a decompiler. What if you wanted to see how Microsoft builds the very Framework .dlls we use daily (like System.Text, System.Net, etc.; non-meta data versions only)? Use a decompiler.

VS.NET actually shipped with a tool for this named ILDASM.exe. Strictly speaking it is a disassembler: it displays the raw MSIL rather than turning it back into VB.NET or C#. This tool was not that bad early on, and I used it for several years. If you ever want to try it out it is located in C:\Program Files\Microsoft Visual Studio [Version]\SDK\v2.0\Bin. I believe the tool was only shipped through VS.NET 2008, but to be honest it was so primitive it really isn’t worth investigating too far, because there are much, much better tools available.

However, I wanted to briefly mention (2) very nice *free* tools available for decompiling those compiled .NET assemblies. The 1st is made by a company called RedGate and is named .NET Reflector. Now I mentioned *free*, and there have been a lot of upset devs out there because RedGate announced it was going to be charging $35 once version 7 was available. However, because the community reeled, they made an announcement this past week stating that all version 6 users would get an update with a perpetual free license. You can read about the announcement here: http://eon.businesswire.com/news/eon/20110426007021/en/.NET-Reflector/Reflector/Red-Gate


You can still download version 6 of .NET Reflector, and then have it auto-update itself to the latest version of 6 (6.8 as of this post). At $35, version 7 is not a bad price either, but apparently they upset a lot of devs by promising to always keep Reflector free and then deciding to charge. Once you have installed .NET Reflector it will give you the option of integrating itself into any installed version of VS.NET you have on your machine. This gives the developer the option of actually debugging assemblies referenced in your project, which is a really nice feature.


At the perfect time a company named Telerik (well known for their 3rd party rich controls and reporting tools for different .NET technologies) pounced in this week with their own decompiler named "JustDecompile". And as they state on their webpage: "Powerful, Free Decompiling. Forever." Notice the 'Forever'. A nice little dig at RedGate and .NET Reflector I am sure. Anyway, this is a nice decompiler as well, but it is in Beta and, in my experience, not as mature or as reliable as Reflector. You know what I say though, "Get both, they are free!!"

To use either decompiler, open the shortcut to whichever tool you want to use; the shortcut is probably on the desktop, or you might have to go to the installation directory:


Once the decompiler tool is open, they all function similarly. Select "Open" and browse to any compiled .NET .dll. Once opened, you typically navigate the binary on the left, and when selecting a method, property, etc. from the left, you will be able to see the code on the right:

Reflector:



JustDecompile:



Now hopefully some things have become clearer on the topics of .NET compiled binaries and how to decompile them. With this understanding, you do not want to place things like passwords, connection strings, secret proprietary algorithms, etc. into a .NET .dll that will be shipped out or sold. This is because anyone with a little knowledge could decompile your code and see what is inside of it.

One way to complicate the decompiling process is to "Obfuscate" the compiled code. This renames and rearranges the code to make it difficult to decompile and read, but not impossible. There are some good obfuscation tools available like Dotfuscator, but they can get quite pricey for a full version. A better solution for hiding values you do not want seen or hacked is to place them in your application’s .config file and then use a tool like aspnet_regiis to encrypt them. Nothing's bulletproof, but anything is probably better than Dim MySecretPassword As String = "VeryToughPwd123"

So there isn’t much to lose. Go ahead and download both decompilers and give them a try!
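As a rough sketch of the .config approach: once the connectionStrings section has been encrypted with aspnet_regiis, the application reads the value exactly as it would an unencrypted one, because the Framework decrypts the section transparently at run time. The connection string name "MyDb" and the site path in the comment are hypothetical, and the project needs a reference to System.Configuration.dll.

```vbnet
Imports System.Configuration

Module ConfigSecrets
    ' Encrypt the section once from a Framework command prompt (path is a placeholder):
    '   aspnet_regiis -pef "connectionStrings" "C:\inetpub\wwwroot\MyApp"

    ' Reads the connection string; no decryption code is needed in the application.
    Function GetConnectionString() As String
        ' "MyDb" is a hypothetical entry name in the <connectionStrings> section.
        Dim settings As ConnectionStringSettings = ConfigurationManager.ConnectionStrings("MyDb")
        If settings Is Nothing Then
            Throw New ConfigurationErrorsException("Connection string 'MyDb' was not found.")
        End If
        Return settings.ConnectionString
    End Function
End Module
```

The point of the design is that the secret lives outside the compiled assembly, so a decompiler pointed at the .dll only sees the lookup by name.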

Telerik JustDecompile

(you will have to search and find a freeware site to download Reflector version 6, since RedGate no longer offers it directly from their site for obvious reasons)

Another one to add to the list as well:

JetBrains dotPeek

The 'My' Namespace in a VB.NET Application is Missing Members

OK I ran across this little snag in a VB.NET application, and thought I might share an easy solution to the problem. If you ever have a .NET solution that originated in a previous version other than 2010 and has been upgraded, you might have some difficulty accessing members of the 'My' namespace. Specifically I was missing the auto generated 'MyWebExtensions.vb' code file displayed below:


This resulted in the "'Application' is not a member of 'My'" error:


There are probably a few tweaks you could make to the solution (.sln) file by opening it in Notepad and fixing the appropriate settings, but the easiest way to fix the error is to remove the 'My' extension and add it back into the project. To do this, open the properties of the project where the error is occurring, and click on the "My Extensions" tab.


The 'My' extension you need for your project is dependent on the type of project you have created (Web, WPF, etc.), so make sure to select the proper one when re-adding. First though, right-click the current extension already added and select “Remove Extension”


Then press the "Add Extension" button in the bottom right-hand corner, and add the appropriate extension back into your project. This should auto-generate the 'MyWebExtensions.vb' code file and fix any previously non-accessible members in the 'My' namespace.


For more information on the 'My' namespace, check out the (2) MSDN links below:

My Namespace

How My Depends on Project Type (Visual Basic)

Thursday, April 28, 2011

Honored to Receive the "Microsoft Community Contributor Award"

I received a wonderful email today that I have been recognized with the 2011 "Microsoft Community Contributor Award"! This entry is not to toot my own horn, but rather to say thank you to Microsoft for the recognition. I have used Microsoft products and software almost my entire life, and now am very proud to make my career and hobby in software engineering revolve around Microsoft and the .NET Framework technologies. As I continue to hone and grow my own skills, I will continue to share and contribute that knowledge to help others in our community in whatever way I can. It might be a small token of recognition, but one that I am proud and happy about. Thank you Microsoft.

Tuesday, March 22, 2011

ASP.NET Web Forms vs. ASP.NET MVC

So it has been a few years now since Microsoft introduced the ASP.NET MVC Framework which is Microsoft's implementation of the MVC Architecture for ASP.NET. No the MVC (Model-View-Controller) architecture is not a new concept; just Microsoft's implementation of it in the .NET Framework. There has been a lot of buzz around whether to continue to use ASP.NET web forms which have been around since 2001, or to go with this "hot" new technology in ASP.NET MVC... or something else! There is no 'Silver-Bullet' answer to this, and with most situations, 'it depends'.

Oh I know, you might have "Googled" the exact phrase of this blog post and were hoping to get some answer to exactly which one is better. Nope. I just want to highlight a few of the strengths and weaknesses of both from an abstract point of view, and briefly mention some other alternatives too.

To begin, I am going to repeat a stat that Andrew Duthie (Twitter handle: @devhammer) brought up on this topic at an ASP.NET Firestarter event in Orlando last December. He showed a stat that somewhere in the range of 80%+ of dev shops were still using ASP.NET web forms for their bread and butter applications, and that ASP.NET web forms were not going away! Anyone trying to spread a rumor like that needs to check with the folks direct from Redmond 1st, because it simply isn't true. To this end, folks shouldn't feel like they missed the last train to MVC euphoria, because not everyone has boarded yet. I am not building up an article against MVC; I just want to point out that it is a misconception, however convincingly some may argue it on forums and blogs, that ASP.NET MVC has completely replaced an outdated web forms architecture.

So maybe you are leading a team of developers and are looking to decide the pros and cons of MVC, or maybe you are about to build a site for a friend/family member as fast as possible, or involved in a large scale enterprise application that will be web based. Whatever the scenario, you want to know the details of which architecture to choose. After all it will be quite difficult to change between the two once the code begins to flow. Making the right decision up front is important in any of these scenarios no matter how small or large the project.

Let's 1st speak to the advantages of the MVC framework. 1st and foremost in my book is the architecture itself: M-V-C. It is going to be tough to break this model and rearrange how the application is built. Having your or your team's hand forced into a stable and mature architecture like MVC is a good thing. Raise your hand (like anyone can see...) if you have ever seen or maybe written one of those spaghetti sandwich ASP.NET web forms applications that has raw SQL right behind "btnSave_Click()". Blah! An awful, nasty, all code-behind-the-forms, non-scalable mess of an application. So 'point' to MVC for guiding developers to a decent architecture. You can adhere to several good architectures using ASP.NET web forms, but it is up to the developer or team of developers to be disciplined enough to stick to the architecture and not fall back on a fat UI layer. Separation of concerns is a key design principle and a real winner for the MVC architecture. Each layer has its own responsibility, and placing the proper code in its responsible layer will make for a much better code base to scale, maintain, inherit, etc.

Next, ASP.NET MVC allows the developer to have "Full Control" over the rendered HTML. For now I am going to leave this as an advantage for MVC, but caution that not everyone fully understands what this means, and it might not be advantageous for the vast majority of applications that do not need this type of control. Let me elaborate a bit. That wonderful winter of 2001, when many of our lives took a turn for the better, Scott Guthrie (Twitter handle: @scottgu) came up with what is ASP.NET web forms, and decided while sipping hot cocoa that if you dragged a label onto a web form, it would get rendered as a SPAN tag. He made this decision (or Microsoft did) and not you. You pick the ASP.NET server controls you want to drag onto a web form, but don't have control over what HTML elements actually get rendered on the client as a result. You just know that when a Gridview is dragged onto a form, it will be displayed on the client. You don't know (or need to in many situations) that it is rendered as an HTML table wrapped in a DIV. This abstraction is actually an advantage for ASP.NET web forms. For many web applications built, we have relied on this drag and drop capability, and trusted ASP.NET with the final output that will be rendered. However, if you have ever found that the way the ASP.NET server controls get rendered is not exactly how you want or need it, then MVC is a better choice as it will allow you to have control over the final rendering. If you do not have any main issues with how ASP.NET web form server controls get rendered in your applications, then this might not be an "advantage" that you can sell to your tech manager or team when trying to build the next project using MVC.

Another advantage of using ASP.NET MVC that ties back into the individual layers is testability, specifically unit testing. Because the application is loosely coupled with code in its separate layers, the ability to create unit tests increases greatly. The event based ASP.NET pages that rely on things like Session, Context, Response, etc. are quite difficult to test effectively, because page functionality is spread across event handlers and tied to those external dependencies, which makes it hard to isolate. MVC improves on this greatly, and is architected to be testable. 'Point' MVC.
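To show what that testability looks like in practice, here is a minimal sketch of exercising a controller action with no web server, Session, or HttpContext involved. The controller, action, and "Message" key (GreetingController, Index) are hypothetical names invented for the example, and the project is assumed to reference System.Web.Mvc. A real project would put the check in a unit test framework rather than a Main method.

```vbnet
Imports System
Imports System.Web.Mvc

' Hypothetical controller: the action only works with its inputs and ViewData.
Public Class GreetingController
    Inherits Controller

    Public Function Index(ByVal name As String) As ActionResult
        ViewData("Message") = "Hello, " & name
        Return View()
    End Function
End Class

Module GreetingControllerCheck
    Sub Main()
        ' The controller is just a class, so it can be created and called directly.
        Dim controller As New GreetingController()
        Dim result As ViewResult = TryCast(controller.Index("World"), ViewResult)

        ' Verify both the result type and the data handed to the view.
        If result Is Nothing OrElse CStr(result.ViewData("Message")) <> "Hello, World" Then
            Throw New Exception("Index did not populate the expected message.")
        End If
        Console.WriteLine("Index action verified without a web server.")
    End Sub
End Module
```

Doing the equivalent against a web forms code-behind method usually means faking the page lifecycle first, which is the difficulty described above.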

So let's examine ASP.NET web forms for a minute. As I mentioned they were born sometime in late 2001, so maturity and a rich tool set (both from Microsoft and 3rd party) are a big advantage. These "tools", or ASP.NET server controls, make up a style of development often referred to when speaking of ASP.NET web forms as RAD, or Rapid Application Development. "Hey I can drag a few ASP.NET text boxes, some labels, this Gridview thingy, press publish, and WHAT??!! I have a website!" Now that website might be junk, but hey it's done already. On to the next project, right? Well not quite. 'Fast', especially in our field, doesn't mean better. That site might be done, but as alluded to earlier, its design is junk and it will not maintain or scale well. But what if you know the scope is a single, simple page? Well then a tool like web forms is probably perfect, and MVC might be overkill if speed of development is of the essence. Don't be fooled, web forms + data binding, etc. will be faster to generate just based on the drag-and-drop + wizardry capabilities. Just don't have a code review and everything will be fine! So the RAD capabilities of ASP.NET web forms can be a double edged sword. A quality ASP.NET web forms application with a pre-determined architecture (3-layer, Domain Driven Design, MVP, etc.), built by developers with the discipline to adhere to it, is a much better choice when creating ASP.NET web form applications.

Along these lines is another advantage with web forms and that is 'Experience'; specifically in relation to a developer's experience with the technology. Again with web forms being a mature technology going on 10 years, there is a solid developer core with ample experience. If you are leading a team of 8-10 developers, you are the only one with MVC experience, and everyone else knows web forms, then this needs to play into your decision of which technology to use. Are you prepared to take the financial impact of formal MVC training through remote seminars, current books, or on-site training? Can you afford the 1-3 month setback while the team gets up to speed on ASP.NET MVC to, say, a beginner/intermediate level? If these are not an issue, and the advantages of ASP.NET MVC mentioned previously are present, then it may be the better choice regardless of the upfront cost in time, money, etc. If however, you need to begin writing code immediately with a team of developers rich in web forms experience and are under a tight deadline, then maybe holding off on using MVC is the better decision. In the end you need to weigh the factors to make the best decision.

There are some other disadvantages to the web forms technology that I will mention briefly here as well. The web forms postback model relies on a generated client side '__doPostBack()' JavaScript function for server callbacks. If JavaScript is disabled this will cause several problems for the web forms application. It is also difficult to manipulate manually if writing your own client side scripts. MVC improves on this using REST based URLs. The next disadvantage with web forms is ViewState. Referring back to my comments about poorly designed web form applications, a bloated ViewState can cause pages to be larger and slower than they need to be. ViewState stores a base64 encoded string with information about the state of the controls so that it persists across postbacks, helping to combat the stateless nature of the web. This however becomes a disadvantage for web forms when abused.

Now let's throw a cog into the wheel and add in another choice for web development in the .NET realm: Silverlight. Oh yeah, and honestly my favorite when it comes to rich UI content. However, you still need a hosting application, and Silverlight can be integrated into either ASP.NET web forms or ASP.NET MVC applications. And I don't want to confuse anyone either, because Silverlight is not an entire 'web' Framework unto itself; it works as individual controls. The Silverlight controls integrate into a web forms or MVC application as just another control and can co-exist with controls of either technology. However, since Silverlight can't invoke the controller class in MVC, or directly postback to the server when using web forms, you will need to use services (WCF) to accomplish this task. Anyone who has used Silverlight previously knows that WCF services can bridge the gap between the Silverlight control and the server.

Decisions, decisions, which technology to pick? There is a lot to weigh when it comes to these two technologies: ASP.NET web forms or ASP.NET MVC. Are you trying to draw a line in the sand and say, "Only MVC web apps from here on out!!" Personally I would not do that, because it isn't like 10 years ago when we moved from classic ASP to ASP.NET and were able to be that bold. Now ASP.NET MVC has to be looked at as another tool (a very powerful and cool one) in the proverbial toolbox. There will be a time when MVC is the proper decision, and a time when web forms is the proper one. I strongly urge those picking web forms to at a minimum not fall into the trap of bad or poorly architected code, and be disciplined to use an architecture that is more scalable and maintainable (check my review of the book Professional ASP.NET Design Patterns). And lastly and maybe most importantly, be a 'realist' and not a 'purist' when making this decision. Do what is best and right, and not just what is theoretically better in an argument. This will help you decide. Either way it's great to be a .NET developer with such a plethora of technologies to choose from.

Wednesday, March 9, 2011

How To: Encode a JavaScript string in .NET To Escape Characters (i.e. Single Quotes) Automatically

There is a handy Utility method new to the .NET Framework 4.0, found on the HttpUtility class in the System.Web namespace, called HttpUtility.JavaScriptStringEncode. It takes any string you the developer create and encodes it (escaping quotes and other special characters) so that it is usable inside a JavaScript string literal.

Ever tried making a Utility method to create a script to be registered with the page in ASP.NET that will popup a JS alert box and takes a message? Ever see what happens when you try and place a single quote in the value like "The value 'x' is not allowed"? The JS will error out on the page with a " Expected ')' " type error message. If you use the utility mentioned above, the encoding is done for you and injects any needed escape characters. The line of code below shows how this might be done in .NET code:

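Dim Message As String = "The value 'x' is not allowed"
'JavaScript-encode the message (escaping quotes and other special characters) prior to registering with the page
Dim JsAlert As String = "alert('" & HttpUtility.JavaScriptStringEncode(Message) & "');"
'Passing True as the last argument wraps the script in <script> tags so the browser executes it
ClientScript.RegisterStartupScript(Page.GetType(), "AlertJsFunction", JsAlert, True)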
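A small variation on the snippet above, sketched under the assumption that you would rather not manage the surrounding quotes by hand: JavaScriptStringEncode has a second overload that takes a Boolean and, when passed True, wraps the encoded value in double quotes for you. The module and function names (JsEncodeSketch, BuildAlert) are made up for the example.

```vbnet
Imports System
Imports System.Web

Module JsEncodeSketch
    ' Builds an alert() call whose argument is a safely encoded, already-quoted JavaScript string.
    Function BuildAlert(ByVal message As String) As String
        ' The second argument (True) asks the method to add the enclosing double quotes itself.
        Return "alert(" & HttpUtility.JavaScriptStringEncode(message, True) & ");"
    End Function
End Module
```

The returned string can then be passed to ClientScript.RegisterStartupScript in the same way as the hand-built version.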

Monday, February 28, 2011

Visual Studio Live! is coming up soon (April 18-22) in Las Vegas

If you’ve never been to Visual Studio Live, it offers developers, programmers, software engineers and architects an unbiased blend of practical and immediately-applicable training in Visual Studio, Silverlight, WPF, .NET and more. Plus, there will be 2 new tracks on mobile development and HTML5 this year! Check out the Visual Studio Live! Agenda at http://bit.ly/VSLiveTrks.