Exploring all things software engineering and beyond...

How To: Strip Illegal XML Characters from a String in VB.NET

Recently I was having some trouble with string data being sent to an .asmx web service I had built. The call was returning the following exception message:

"Response is not well-formed XML System.Xml.XmlException: ' ', hexadecimal value 0x13, is an invalid character."

The cause was my users copying and pasting data from Microsoft Word into a WYSIWYG editor that preserved illegal characters, such as the 0x13 control character (displayed as '!!') shown in the exception above.

Rather than put in place some calls shielding the web service from the bad data, I decided to research building a method that would strip out illegal characters prior to placing the data into my business object on the front end. Of course I could check it on the back end too to be thorough, but this is what was appropriate for my scenario.

There turns out to be some information on this topic, but oddly enough most of the solutions were written for Java and PHP. The .NET solutions I found were only half working and not complete. The best solution I came across was one written in Java at Ben J. Christensen's Blog. With help from users on the ASP.NET forums I was able to put all the information I had found together and come up with a VB.NET version of the code. I really credit Ben and the forum for the base code help; thank you.

The code's purpose is to take the passed-in string value and check each character one by one to see if any illegal XML characters exist. All valid characters are re-appended to the output, and illegal characters are omitted.

If you need the C# version, check the forum link I provided above. The main difference is that the 'AscW' function that wraps the character in focus is not required in C#. This is because C# and VB.NET handle character-to-integer conversions differently. The final code is below, and hopefully this .NET version will help somebody in the future as it helped me.


'Requires "Imports System.Text" at the top of the file for StringBuilder.
Public Shared Function RemoveIllegalXMLCharacters(ByVal Content As String) As String

    'Exit out and return an empty string if nothing was passed in to the method
    If String.IsNullOrEmpty(Content) Then
        Return String.Empty
    End If

    'Used to hold the output.
    Dim textOut As New StringBuilder(Content.Length)

    'Loop through the length of the content one character at a time to see if there
    'are any illegal characters to be removed:
    For i As Integer = 0 To Content.Length - 1
        'Reference the current character and its numeric code
        Dim current As Char = Content(i)
        Dim code As Integer = AscW(current)

        If Char.IsSurrogatePair(Content, i) Then
            'A surrogate pair encodes one character in the legal range &H10000 to &H10FFFF.
            'A single Char can never hold a value that large, so keep both halves here
            'and skip past the second half.
            textOut.Append(current).Append(Content(i + 1))
            i += 1
        ElseIf code = &H9 OrElse code = &HA OrElse code = &HD _
            OrElse (code >= &H20 AndAlso code <= &HD7FF) _
            OrElse (code >= &HE000 AndAlso code <= &HFFFD) Then
            'Only append valid characters back to the StringBuilder
            textOut.Append(current)
        End If
    Next

    'Return the screened content with only valid characters
    Return textOut.ToString()

End Function

Someone had asked how this method could be modified to accept and return an 'XmlDocument' type. The method only needs a few small code changes to support this, and would make a good overload to the original function. You will need to import the System.Xml and System.IO namespaces for this overload.


Public Shared Function RemoveIllegalXMLCharacters(ByVal XmlDoc As XmlDocument) As XmlDocument

    'Exit out and return Nothing if no document was passed in to the method
    If XmlDoc Is Nothing Then
        Return Nothing
    End If

    'Use a StringWriter & XmlTextWriter to extract the raw markup from the XmlDocument passed in:
    Dim sw As New StringWriter()
    Dim xw As New XmlTextWriter(sw)
    XmlDoc.WriteTo(xw)
    xw.Flush()
    Dim Content As String = sw.ToString()

    'Exit out and return Nothing if the document produced no markup
    If String.IsNullOrEmpty(Content) Then
        Return Nothing
    End If

    'Reuse the String overload above to screen out the illegal characters
    Dim cleaned As String = RemoveIllegalXMLCharacters(Content)

    'Build a new XmlDocument to return containing the screened content with only valid characters
    Dim XmlDocClean As New XmlDocument()
    XmlDocClean.LoadXml(cleaned)
    Return XmlDocClean

End Function
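
For anyone on a newer framework, here is a minimal sketch of the same character-by-character filter built on System.Xml.XmlConvert instead of hand-written ranges. It assumes .NET Framework 4.0 or later (where XmlConvert.IsXmlChar and XmlConvert.IsXmlSurrogatePair are available), and the module and function names (XmlCharFilterDemo, StripWithXmlConvert) are made up for illustration. Treat it as a starting point rather than a drop-in replacement for the code above.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module XmlCharFilterDemo

    ' Hypothetical helper: same idea as the String version of
    ' RemoveIllegalXMLCharacters, but the validity checks are delegated
    ' to XmlConvert (assumes .NET Framework 4.0 or later).
    Function StripWithXmlConvert(ByVal content As String) As String
        If String.IsNullOrEmpty(content) Then Return String.Empty

        Dim textOut As New StringBuilder(content.Length)
        Dim i As Integer = 0

        While i < content.Length
            Dim current As Char = content(i)

            If XmlConvert.IsXmlChar(current) Then
                ' Ordinary legal character: keep it.
                textOut.Append(current)
            ElseIf i + 1 < content.Length AndAlso _
                   XmlConvert.IsXmlSurrogatePair(content(i + 1), current) Then
                ' Valid high/low surrogate pair: keep both halves.
                ' Note the argument order is (lowChar, highChar).
                textOut.Append(current).Append(content(i + 1))
                i += 1
            End If
            ' Anything else (control characters, lone surrogates) is dropped.

            i += 1
        End While

        Return textOut.ToString()
    End Function

    Sub Main()
        ' ChrW(&H13) is the character from the exception message above.
        Dim dirty As String = "Pasted" & ChrW(&H13) & " from Word"
        Console.WriteLine(StripWithXmlConvert(dirty)) ' Prints: Pasted from Word
    End Sub

End Module
```

The upside of this approach is that the definition of a legal XML character lives in the framework rather than in my own code, which is one less set of hex ranges to get wrong.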

How To: Use 'Edit and Continue' Debugging Functionality in VS.NET 2008

In my early years of development, I worked a lot with VB6 and VBA in Access. One of the nice parts about developing on those platforms back then was the ability to make changes while debugging, and then continue debugging; hence 'Edit and Continue'. Dare I even say that Access was nice in that if a user had an error display on their screen I could even press [Ctrl] + [Break] at their computer and jump immediately into the code (assuming it was not compiled into a .mde). OK, I know that is really bad, but back in the day when I was new to development, that made getting to the problem really direct and easy. Oh well, those were the days, right?.... NOT! I love .NET!!

Visual Studio .NET has claimed for some time now to have 'Edit and Continue' capabilities. Several years ago, I jumped all over it for my ASP.NET development, but could never get it to actually work. To be honest I gave up on it, and assumed it did not work for ASP.NET development and was mainly for Windows Forms or other types of thick client development.

Well, somewhere along the way it has become functional (or it always has been and I didn't configure it properly), and that will save me a lot of time. I can't count how many times I have been 5 levels deep into code and needed to make a tiny change. I just got so used to pressing 'Stop' in VS.NET, making the changes, and then starting all over again. It didn't dawn on me to check out the 'Edit and Continue' functionality until something I read recently piqued my interest.

I know a lot of you may read this and say... "You didn't know about this???" Yes I did, but obviously I never configured it properly. So this post is more for the seasoned developers or new guys that may have let the 'Edit and Continue' functionality in ASP.NET projects go by the wayside because they could never get it to work either.

It is really simple to implement in VS.NET 2008, and really only involves (2) steps to get the functionality working. The only prerequisite is that you are using a 'Web Application' project type. The 'Website' project type in VS.NET does not have the 'Edit and Continue' functionality and will display a message like: "The source file has changed..." if you attempt to make changes while debugging, and will not allow changes.

If you are using a 'Web Application' project type here are the (2) steps you need to do to get the 'Edit and Continue' functionality to work in an ASP.NET app:

  1. In Tools -> Options -> Debugging -> Edit and Continue, make sure 'Enable Edit and Continue' is selected. In mine I left the default options selected and did not make any further changes.

  2. Double click on 'My Project' in 'Solution Explorer' (or alternatively, right click your project and select 'Properties'). From here select the 'Web' tab. Under 'Servers' make sure 'Use Development Server' is selected (not IIS), and finally, make sure to check the checkbox that states 'Enable Edit and Continue'.

After you have configured the above (2) steps, set a breakpoint on some code and start debugging. Once the breakpoint was hit, I tried switching some string values and even variable names, then continued debugging, and everything worked!

Again, this post doesn't really highlight anything brand new in VS.NET, but is really just there to help dust something off that some of us may have forgotten about or overlooked.

VSLive! Orlando 2009 Thoughts and Comments

Well I attended the VSLive! conference this week from Sun-Wed (10/4 -> 10/7) in Orlando, FL. This is my second VSLive! conference, as I attended the one in the fall of 2007 in Las Vegas.

My 1st impression of this conference vs. the one from (2) years ago was how much the attendance was down. In Vegas (2) years ago, there must have been 1000 attendees. I think this time I overheard one of the conference organizers state that there were about 350 attendees. I do not attribute any of this to the conference content or presenters, but rather to the economy. The conference this time around was about $1400 with a 'buy one get one free' option a few months ago. Even at this discounted cost (which I believe is down from around the $1900 we paid (2) years ago in Vegas), it is still a lot for companies of all sizes where the 1st item to get cut in budgets is typically training.

One note for those reading this who have never been to a VSLive! conference and are wondering about attending. At $1400 (give or take) to go to the conference, odds are the entire team will not be able to go. In actuality it works out better anyway, as I will explain. This conference is probably best for Software Engineers with 5+ years of development experience and a solid knowledge of Microsoft .NET Technologies. The main sessions (not including the pre-conference day long workshops) are 75 minutes each with a 15 minute break in between. 75 minutes on ASP.NET MVC, WCF, WF, or TFS is obviously not enough to learn the entire topic, but it gives seasoned engineers 'food for thought' on new or existing technologies, which is GREAT and why I enjoy it so much. However, if you have a developer on your team that still thinks a 'Class Object' is an apple (object) on a teacher's desk (class), then this is not for them.

Another comparison I have between the Vegas conference and the Orlando conference was the 'Passport Gaming Lounge'. In Vegas it had black leather couches to watch movies, lots of comfortable seating, and seemed to be really popular because the room was always packed. In Orlando, they used those same uncomfortable conference room chairs, and the snacks and drinks were constantly running out while we waited for more. It didn't have that same 'oasis' feel that the one in Vegas had. It is a nice perk for these (4) day long conferences.

A traditional event at the VSLive! conference is the 'VSLive! After Dark' event, typically held on the 2nd or 3rd day of the conference in the evening for attendees to network and kick back. It also gives another chance to meet in person with the presenters or to ask questions, which is great. In Vegas, it was an 'all you can drink', really kick back and have a good time event. The one here in Orlando still had the framework of a nice event, but fell a bit short and attendance was low. I only networked with one other set of developers from Atlanta. I think the downfall may have been the (1) drink coupon we received as compared to the 'all you can drink' from before. And to add on a glass of wine was $9.00... no thank you. I really did enjoy when they got a panel of the presenters to get up on stage and debate (lightheartedly) 12 questions asked by the audience. I thought the humor was great and was really the highlight of the After Dark event.

One other high point of going to a VSLive! conference is the ability to hear Microsoft representatives speak of new and upcoming products in the daily 'Keynote Speaker' session that kicks off the day. This year they announced the 'Microsoft Team Foundation Server Basic' package that will soon be available, in response to many people being intimidated by TFS because of the massive install and steep learning curve. This excited me because we are still using the archaic VSS, and it would be nice to make small steps to something more modern. TFS Basic should help with that small step.

Now to the meat and potatoes of the event... the actual conference content. This did not fall short by any means and was as informative and interesting as expected. Good Job VSLive! + Presenters!! Some of the presenters seem to be regulars at VSLive! like Rocky Lhotka, Richard Hale Shaw, Paul Sheriff (scheduled to be there but slipped on his boat in LA and could not attend), and Ken Getz. Then there were some presenters I had not seen previously (they may have been there but I had not attended their sessions) like Billy Hollis (a.k.a. Bully... not really, but anyone that attended the Agile session would understand this), John Papa, Gus Emery, Miguel Castro, and Aaron Skonnard, who were all just as impressive.

I tell you, these presenters in my opinion did a fantastic job and I enjoyed all of the sessions! The content provided by a VSLive! conference, regardless of all of the little things I wrote at the beginning, makes it well worth every penny. If I had to pick (1) presenter that was my favorite I would have to say Rocky Lhotka. He is the founder of the CSLA.NET Framework and instructed the pre-conference session called 'Build Distributed App in .NET 3.5 SP1'. I actually attended this same or closely titled session in Vegas (2) years ago, and Rocky does a great job of keeping it fresh. I would really like to implement his CSLA.NET framework in a new project at some point in the near future; it looks to take care of a lot of the plumbing and 'in between' code needed with OOD.

I also enjoyed listening to Miguel Castro in his Advanced ASP.NET class. He created a framework called 'NavFlow' that wraps up the server side needs for implementing forward and back button functionality in a navigation wizard style web application. Miguel's Blog

Gus Emery gave a great (2) part presentation on ASP.NET MVC that I enjoyed, because although I know of the concepts I have not yet built out a production application using this design. Gus' Blog

Aaron Skonnard gave a wonderful and concise presentation on Cloud Computing that I thought was easy to follow. I felt like I went from almost 0 knowledge on the subject to being really well informed (I can now call a Spade a Spade) in 75 minutes. Aaron's Blog

Lastly, I thought Billy Hollis had a great sense of humor, and a really engaging way of presenting his topic. This guy has a ton of experience in SmartClient and ThickClient application development going way back, and I would like to hear him speak more as I get more into WPF.

One random point here too: it appears that as nice as 'Linq to SQL' is to use, it does not actively have any main staff at Microsoft with plans to further enhance or upgrade it. 'Linq' itself is not dead, just Linq to SQL is being phased out. Soon asking questions about it will be like asking about how to use WSE 3.0 in VS.NET 2008. The answer to that is "Quit using asmx and make a WCF service". For Linq to SQL it will be "Quit using it and use the Entity Framework". Better get used to the Entity Framework, which I now understand is much improved in the .NET Framework 4.0. WF (Windows Workflow Foundation) is also supposed to have much improved functionality in Framework 4.0. It seems that the early adopters of some of these technologies have gone through some pains, and Microsoft heard them and has now made significant improvements. VS.NET 2010 however needs to get some more improvement before release. I can't tell you how many times it crashed in the different sessions, to the point where VS.NET 2008 was mostly used where applicable.

Overall, I was extremely pleased with this VSLive! conference and wouldn't think twice about attending another one... I would definitely go again.