The .NET Uri Class and the Cambia.UriExtensions NuGet Package
By Steve on Wednesday, September 14, 2016. Updated Saturday, September 24, 2016.
The .NET Uri class is powerful in a number of scenarios, but can be unwieldy at times. I've created a set of extensions to the Uri class (and tested them thoroughly) to make a range of interactions with the Uri class downright fun. These improvements include:
- Never having to worry about URI delimiters again :/@:/?&#.
- Fluent (chainable) methods for dropping and setting parts of a Uri.
- Representing URLs in commonly needed formats: absolute, virtual, app relative, scheme relative.
- Extracting, setting, replacing and appending query items.
The Uri class in .NET, like similar classes in other frameworks, does a lot of heavy lifting.
If you've ever spent time digging into the specifications on URIs and the many different behaviors for different schemes then you know you don't want to spend any more time on this than necessary.
I decided to encapsulate a number of helpful methods from my own code libraries into a well-tested, robust package for public consumption. What I thought would take a couple days became a much longer task, but the result should save others and my future self a lot of time and headache.
For a short introduction on the structure of URIs see my article covering the anatomy of a URI.
Use of Uri Can Be Tedious
One of the biggest annoyances when using the Uri class is dealing with the various delimiting characters.
http://www.domain.com/path1/path2?queryitem=value&q2=value2#fragment
In a typical URL, a colon follows the scheme (http:), slashes delimit the scheme from the authority (//), a question mark (?) begins the query string and hash (#) denotes the fragment. And more...
The Uri class is a bit inconsistent regarding delimiters.
For example, the Uri.Query property contains the leading question mark:
?queryitem=value&q2=value2,
but the scheme does not include the trailing colon
http.
Nor does the authority include the leading double slashes
www.domain.com
but the path does include a leading slash
/path1/path2,
and the fragment includes the leading hash:
#fragment.
I'm not saying the Uri class is broken. It's not. I've tested it a lot and it's pretty solid, just annoying.
There are ways on the Uri class to get these components with and without delimiters, but those are a bit cumbersome and inconsistent, too.
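To make the inconsistency concrete, here is a small sketch that reads the sample URL back through the built-in properties. It uses only the standard System.Uri class (nothing from Cambia.UriExtensions) and assumes a C# version that supports top-level statements.

```csharp
using System;

// The sample URL from above, read back through the built-in Uri properties.
var u = new Uri("http://www.domain.com/path1/path2?queryitem=value&q2=value2#fragment");

string scheme = u.Scheme;        // no trailing colon
string authority = u.Authority;  // no leading slashes
string path = u.AbsolutePath;    // leading slash included
string query = u.Query;          // leading question mark included
string fragment = u.Fragment;    // leading hash included

Console.WriteLine(scheme);     // http
Console.WriteLine(authority);  // www.domain.com
Console.WriteLine(path);       // /path1/path2
Console.WriteLine(query);      // ?queryitem=value&q2=value2
Console.WriteLine(fragment);   // #fragment
```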
Because of these inconsistencies, an overarching goal of the Cambia.UriExtensions package is to make futzing with delimiters a thing of the past.
All you care about is the data in the URI. You'd rather just trust the class to parse and join the parts correctly.
I won't go into further detail on Uri quirks. It's beyond the scope of this article. Just take my word for it. I've spent the last week testing all the nuances of Uri and UriBuilder while sifting through RFC 3986, the definitive spec on URIs.
Most of the Uri quirks will never affect you, but if you want a robust Uri extensions package, well, someone has to think about this stuff. (Lucky me.)
GetPart Method Intro
The Uri class has a method called GetComponents. Why am I providing a method, GetPart, which does essentially the same thing?
Well, yes, it is essentially the same, but with a couple of differences:
- No leading or trailing delimiters.
- Some additional parts.
The GetPart method takes a UriPart enum as its argument so let's have a look at the different parts you can extract using the GetPart method.
using Cambia;
public enum UriPart
{
    None = 0,
    Scheme,
    Authority,
    AuthorityWithUserInfo,
    SchemeAndAuthority,
    SchemeAndAuthorityWithUserInfo,
    UserInfo,
    User,
    Password,
    Host,
    Port,
    HostAndPort,
    Path,
    Query,
    PathAndQuery,
    Fragment,
    PathQueryAndFragment,
}
As I said, when returned from the GetPart method, none of these parts will have leading or trailing delimiters. That even includes the Path. It will not begin with a slash.
The .NET framework also contains an enumeration of Uri parts, but it's called UriComponents. There are a couple of parts in my UriPart enumeration that are not present in UriComponents.
Specifically, AuthorityWithUserInfo and SchemeAndAuthorityWithUserInfo are new concepts.
According to RFC 3986, the Authority actually contains three sections: user info, host and port. However, Uri.Authority only contains the host and port.
Consider the following URI.
ftp://steve@host.com:555/path
The above ftp address is valid and you can see that the host is preceded by steve@. That's user information. The Authority should contain that portion, but you'll find in the Uri.Authority property that it does not. This is why I have added UriPart.AuthorityWithUserInfo to represent a proper definition of Authority.
Meanwhile, I've maintained the Authority part as containing only host and port in order to have naming consistency with the Uri class.
Other than these differences, GetPart has capabilities similar to those of Uri.GetComponents. Use either as it suits your purpose.
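For comparison, here is roughly what the built-in route looks like. This is only a sketch using the standard System.Uri class: a single component requested through GetComponents comes back without its delimiter unless KeepDelimiter is added, and Uri.Authority leaves the user info in a separate property.

```csharp
using System;

var u = new Uri("http://www.domain.com/path1/path2?queryitem=value&q2=value2#fragment");

// Asking for a single component omits its delimiter...
string query = u.GetComponents(UriComponents.Query, UriFormat.UriEscaped);
Console.WriteLine(query);          // queryitem=value&q2=value2

// ...unless KeepDelimiter is OR'ed in.
string queryWithMark = u.GetComponents(
    UriComponents.Query | UriComponents.KeepDelimiter, UriFormat.UriEscaped);
Console.WriteLine(queryWithMark);  // ?queryitem=value&q2=value2

// Uri.Authority holds only host and port; the user info is kept separately.
var ftp = new Uri("ftp://steve@host.com:555/path");
string authority = ftp.Authority;
string userInfo = ftp.UserInfo;
Console.WriteLine(authority);      // host.com:555
Console.WriteLine(userInfo);       // steve
```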
GetPart Examples
Notice that none of the results in the following calls to GetPart have leading or trailing delimiters, thus yielding a more consistent and predictable result.
using Cambia;
Uri u = new Uri("http://www.domain.com/path?a=hello&b=world#fragment");
Assert.AreEqual("http", u.GetPart(UriPart.Scheme));
Assert.AreEqual("www.domain.com", u.GetPart(UriPart.Authority));
Assert.AreEqual("www.domain.com", u.GetPart(UriPart.AuthorityWithUserInfo));
Assert.AreEqual("", u.GetPart(UriPart.UserInfo));
Assert.AreEqual("www.domain.com", u.GetPart(UriPart.Host));
Assert.AreEqual("path", u.GetPart(UriPart.Path));
Assert.AreEqual("path?a=hello&b=world", u.GetPart(UriPart.PathAndQuery));
Assert.AreEqual("a=hello&b=world", u.GetPart(UriPart.Query));
Assert.AreEqual("fragment", u.GetPart(UriPart.Fragment));
u = new Uri("ftp://steve:@ftpsite.com:555/path1/path2");
Assert.AreEqual("ftp", u.GetPart(UriPart.Scheme));
Assert.AreEqual("ftpsite.com:555", u.GetPart(UriPart.Authority));
Assert.AreEqual("steve@ftpsite.com:555", u.GetPart(UriPart.AuthorityWithUserInfo));
Assert.AreEqual("steve", u.GetPart(UriPart.UserInfo));
Assert.AreEqual("ftpsite.com", u.GetPart(UriPart.Host));
Assert.AreEqual("555", u.GetPart(UriPart.Port));
Assert.AreEqual("ftpsite.com:555", u.GetPart(UriPart.HostAndPort));
Assert.AreEqual("path1/path2", u.GetPart(UriPart.Path));
Assert.AreEqual("path1/path2", u.GetPart(UriPart.PathAndQuery));
Assert.AreEqual("", u.GetPart(UriPart.Query));
Assert.AreEqual("", u.GetPart(UriPart.Fragment));
ToUrlType Method Intro
As with the GetPart method, the ToUrlType method takes an enum argument. In this case, it's fittingly called UrlType.
using Cambia;
public enum UrlType
{
    None,
    // Starts with a scheme and is a full, complete URI including everything that follows.
    AbsoluteUri,
    // Starts with a single slash. Includes the path and everything that follows.
    RootRelative,
    // Starts with two slashes. Includes everything except the scheme and its trailing colon.
    SchemeRelative,
    // Starts with a tilde. Relative to the application root. Only available in an
    // ASP.NET context.
    AppRelative
}
ToUrlType Examples
Let's look at some examples.
using Cambia;
Uri u = new Uri("http://www.domain.com/folder/app/file.aspx?a=hello");
Assert.AreEqual("http://www.domain.com/folder/app/file.aspx?a=hello", u.ToUrlType(UrlType.AbsoluteUri));
Assert.AreEqual("/folder/app/file.aspx?a=hello", u.ToUrlType(UrlType.RootRelative));
Assert.AreEqual("//www.domain.com/folder/app/file.aspx?a=hello", u.ToUrlType(UrlType.SchemeRelative));
// If in an ASP.NET context, the following will work as well
Assert.AreEqual("~/file.aspx?a=hello", u.ToUrlType(UrlType.AppRelative));
GetQueryItem Examples
In ASP.NET the Request object provides nice access to query string values, but this feature is noticeably absent from the Uri class. This is partly because only certain schemes actually support query items, but it would be nice to have easy access to query items from a Uri when you need it without having to write your own parsing code.
System.Web.HttpUtility.ParseQueryString(query) is a nice little method in the .NET framework that will extract the query string keys and values into a NameValueCollection. However, it is case-insensitive, which is an unfortunate assumption since query items are actually case-sensitive.
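A quick sketch of that behavior, using only the built-in HttpUtility class. It assumes the project can reference System.Web (on .NET Framework that means adding the System.Web assembly reference). Keys that differ only in case end up in the same entry, with their values joined by a comma.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

// "a" and "A" are distinct keys in the query string itself...
NameValueCollection items = HttpUtility.ParseQueryString("a=hello&b=world&A=goodbye");

// ...but the returned collection treats them as one key.
string lower = items["a"];
string upper = items["A"];
Console.WriteLine(lower);        // hello,goodbye
Console.WriteLine(upper);        // hello,goodbye
Console.WriteLine(items.Count);  // 2
```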
We've created our own static parsing routine, UriBuilderExtensions.ParseQueryItems(...). We use it under the covers and it is optionally case-sensitive. You can call it directly if you like.
But, to make this easier for you, Cambia.UriExtensions adds basic support for getting query string values through the GetQueryItem extension method.
NOTE: This method is not supported for relative URIs or schemes which don't support query strings. Obviously, this feature is most useful with http and https schemes.
All of our extension methods relating to query items are, by default, case sensitive!
KEY=val is not the same as key=val.
using Cambia;
Uri u = new Uri("http://host?a=hello&b=world");
Assert.AreEqual("hello", u.GetQueryItem("a"));
Assert.AreEqual("world", u.GetQueryItem("b"));
Assert.AreEqual(null, u.GetQueryItem("c"));
u = new Uri("http://host?a=hello&b=world&A=goodbye");
Assert.AreEqual("hello", u.GetQueryItem("a"));
Assert.AreEqual("world", u.GetQueryItem("b"));
Assert.AreEqual("goodbye", u.GetQueryItem("A"));
Path Segments and Path Items Intro
The .NET Uri class does not offer much help when working with the Path portion of a URI.
That's why I've added a UriPath class to the Cambia.UriExtensions library. It provides an easy way to work with path segments and path items.
So, Steve. What are path segments and path items?
If you split a URI path on the slashes, then you end up with the segments. Unless, of course, you're the .NET Uri class. There is a Segments property on the Uri class. It's a string array, but it has a big problem in my opinion. It contains all of the slashes. The first segment is always a slash and subsequent segments contain a trailing slash.
What a pain when all you want is the content between the slashes.
NOTE: I understand why they designed the .NET Uri class the way they did. They wanted all the characters which were present in the original URI to also be present in the various parts that are presented to the user.
You could simply join the segments and end up with the original path. This makes sense for some scenarios. Unfortunately, in practice, I prefer my segments without slashes.
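Here is a short sketch of what the built-in Segments property returns, and the manual cleanup it tends to force on you. It uses only System.Uri and string methods from the framework.

```csharp
using System;

var u = new Uri("https://www.cambiaresearch.com/blog/page/2");

// The built-in segments keep their slashes.
string[] segments = u.Segments;
string joined = string.Join("|", segments);
Console.WriteLine(joined);                    // /|blog/|page/|2

// Concatenating them reproduces the original path.
string rebuilt = string.Concat(segments);
Console.WriteLine(rebuilt == u.AbsolutePath); // True

// Getting at the bare content means stripping the slashes by hand.
string[] bare = u.AbsolutePath.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
string bareJoined = string.Join("|", bare);
Console.WriteLine(bareJoined);                // blog|page|2
```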
On to Path Items. I will admit that the term Path Item is something I've invented.
Path parameters was taken and means something a bit different.
With all the web frameworks today supporting URL routing in various forms, a lot more content
is being put in the URL path rather than in query items. It can make for cleaner URLs.
For example, a page number could be added in either a query string or in a path as follows:
http://www.domain.com/blog?page=2
http://www.domain.com/blog/page/2
If you look at that last URL, where the page is a segment in the path, notice that page acts like a keyword and the 2 in the following segment acts like the value.
This pair of adjacent segments is called a Path Item. The first segment is the key and the next one is a value.
That's it! That is the extent of the definition of a path item.
You'll see this definition has some consequences.
For example, blog/page is also a Path Item. In fact, all of the following pairs are Path Items:
- {blog,page}
- {page,2}
- {2,}
Obviously, some of those are not key/value pairs and don't make sense. It is up to you, the user, to know
which pairs are key/value pairs and deal with them accordingly.
Our UriPath class makes this easy.
Plus, we've also added extension methods to the Uri class.
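To show how little there is to the definition, here is a minimal sketch of a path item lookup in plain C#. FindPathItem is a hypothetical helper written for this illustration and is not part of Cambia.UriExtensions. It assumes a case-sensitive match and simply takes the first matching segment, which may differ from how the library resolves duplicate keys.

```csharp
using System;

// Hypothetical helper (not the library's implementation): returns the segment
// that follows the first occurrence of key, "" when key is the last segment,
// or null when key is not in the path at all.
static string FindPathItem(string path, string key)
{
    string[] segments = path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
    int i = Array.IndexOf(segments, key);   // ordinal, case-sensitive comparison
    if (i < 0) return null;
    return i + 1 < segments.Length ? segments[i + 1] : "";
}

Console.WriteLine(FindPathItem("/blog/page/2", "page"));           // 2
Console.WriteLine(FindPathItem("/blog/page/2", "blog"));           // page
Console.WriteLine(FindPathItem("/blog/page/2", "2") == "");        // True
Console.WriteLine(FindPathItem("/blog/page/2", "steve") == null);  // True
```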
Path Segments Using the UriPath Class
Let's use the following URL as our sample URI:
https://www.cambiaresearch.com/blog/page/2
We can see that the path is
/blog/page/2
and that the segments are blog, page and 2.
// Instantiate UriPath
Uri u = new Uri("https://www.cambiaresearch.com/blog/page/2");
UriPath p = new UriPath(u); // Instantiate with a Uri instance
p = new UriPath(u.GetPart(UriPart.Path)); // Or with a Path string
// Number of segments
Assert.AreEqual(3, p.SegmentCount);
// Get individual segments with indexer.
// No annoying slashes in result.
Assert.AreEqual("blog", p[0]);
Assert.AreEqual("page", p[1]);
Assert.AreEqual("2", p[2]);
Assert.AreEqual(null, p[3]);
// Set, Insert and Drop segments
p.SetSegment(0, "list");
Assert.AreEqual("/list/page/2", p.ToString());
p.InsertSegment(0, "hello");
Assert.AreEqual("/hello/list/page/2", p.ToString());
p.DropSegment(1);
Assert.AreEqual("/hello/page/2", p.ToString());
p.DropSegment("hello");
Assert.AreEqual("/page/2", p.ToString());
// ToString overload allows you to control
// leading and trailing slashes in your output
Assert.AreEqual("page/2/", p.ToString(false, true));
// Finally, let's update our Uri with the new path
u = u.SetPath(p);
Assert.AreEqual("https://www.cambiaresearch.com/page/2", u.ToUrlType(UrlType.AbsoluteUri));
Path Items Using the UriPath Class
As mentioned above, path items are pairs of adjacent segments in the path where the first
of the two is the key and the second is the value.
// Instantiate UriPath
Uri u = new Uri("https://www.cambiaresearch.com/blog/page/2");
UriPath p = new UriPath(u); // Instantiate with a Uri instance
Assert.AreEqual("/blog/page/2", p.ToString());
// Get the page value
Assert.AreEqual("2", p["page"]);
// But remember that every pair of segments is really
// a path item even if they weren't meant to be.
Assert.AreEqual("page", p["blog"]);
Assert.AreEqual("", p["2"]);
// If a path item key can't be found the result is null
Assert.AreEqual(null, p["steve"]);
// PathItemExists checks to see if the segment (key) exists
Assert.AreEqual(true, p.PathItemExists("page"));
Assert.AreEqual(false, p.PathItemExists("steve"));
// Set (updates or appends)
p.SetPathItem("page", "99");
Assert.AreEqual("/blog/page/99", p.ToString());
// Set and move to end. Notice that we are operating on
// the blog/page path item. This doesn't make much
// sense, but you see that page gets replaced by stuff
// and blog/stuff goes to the end.
p.SetPathItem("blog", "stuff", true);
Assert.AreEqual("/99/blog/stuff", p.ToString());
// Set coalesces duplicates into one, putting the path item
// at the position of the rightmost occurrence
p = new UriPath("/blog/page/1/stuff/page/2/morestuff");
p.SetPathItem("page", "99");
Assert.AreEqual("/blog/stuff/page/99/morestuff", p.ToString());
// Append
p.AppendPathItem("key", "value");
Assert.AreEqual("/blog/stuff/page/99/morestuff/key/value", p.ToString());
// Drop
p.DropPathItem("key");
p.DropPathItem("blog");
Assert.AreEqual("/page/99/morestuff", p.ToString());
// Insert
p.InsertPathItem("key", "value", 2);
Assert.AreEqual("/page/99/key/value/morestuff", p.ToString());
// Finally, update the original Uri (if desired)
u = u.SetPath(p);
Assert.AreEqual("https://www.cambiaresearch.com/page/99/key/value/morestuff", u.AbsoluteUri);
Path Items Using the Uri Class
In addition to the UriPath class discussed above, we've added a few extension methods directly to the Uri
class so that you don't always have to instantiate a separate instance of the UriPath class.
Here are some examples:
Uri u = new Uri("http://host/blog/page/2");
// Get a path item value
Assert.AreEqual("2", u.GetPathItem("page"));
// Determine whether a path item exists in the path
Assert.AreEqual(true, u.PathItemExists("page"));
// Set (update or append)
u = u.SetPathItem("page", "99");
Assert.AreEqual("blog/page/99", u.GetPart(UriPart.Path));
u = u.SetPathItem("id", "B54FA1");
Assert.AreEqual("blog/page/99/id/B54FA1", u.GetPart(UriPart.Path));
// Drop
u = u.DropPathItem("id");
Assert.AreEqual("blog/page/99", u.GetPart(UriPart.Path));
Paging
Paging is common in websites and is usually managed by having a page number somewhere in the URL to
indicate which portion of a set should be loaded.
Suppose you're showing a list of 20 blog post summaries, but you can only show five at a time. You would
therefore have four pages of summaries: 1, 2, 3, 4.
Sometimes the page is indicated in the query string
http://domain.com/blog?page=2
Sometimes in the path
http://domain.com/blog/page/2
If in the query, you would say that you have a query item whose key is page.
If in the path, well, I'm calling it a path item whose key is page. See above for
details on path items and working with them.
Page is a special instance of either a query item or a path item.
I've added some extension methods to the Uri class for this special case where the key is page
and the value is an integer number.
// Get the page value
Uri u = new Uri("http://host/blog/page/2");
Assert.AreEqual(2, u.GetPage());
// If there are duplicate page keys, the rightmost one
// takes precedence
u = new Uri("http://host/page/1/blog/page/2?page=3");
Assert.AreEqual(3, u.GetPage());
// If there's no page key, the result is -1
u = new Uri("http://host/blog");
Assert.AreEqual(-1, u.GetPage());
// Set
u = u.SetPageInQuery(3);
Assert.AreEqual("/blog?page=3", u.PathAndQuery);
u = u.SetPageInPath(2);
Assert.AreEqual("/blog/page/2?page=3", u.PathAndQuery);
// Ensure page - puts in query if there's already a query or if the last
// path segment looks like a file. Otherwise, uses the path
u = new Uri("http://host/blog");
u = u.EnsurePage(2);
Assert.AreEqual("/blog/page/2", u.PathAndQuery);
u = new Uri("http://host/blog/file.aspx");
u = u.EnsurePage(2);
Assert.AreEqual("/blog/file.aspx?page=2", u.PathAndQuery);
u = new Uri("http://host/blog?a=b");
u = u.EnsurePage(2);
Assert.AreEqual("/blog?a=b&page=2", u.PathAndQuery);
// Drop
u = new Uri("http://host/page/1/blog/page/2?page=3");
u = u.DropPageFromQuery();
Assert.AreEqual("/page/1/blog/page/2", u.PathAndQuery);
u = u.DropPageFromPath();
Assert.AreEqual("/blog", u.PathAndQuery);
u = new Uri("http://host/page/1/blog/page/2?page=3");
u = u.DropPage();
Assert.AreEqual("/blog", u.PathAndQuery);
Drop and Set Chainable Methods
So far we've discussed three extension methods: GetPart, ToUrlType and GetQueryItem. All of these are about extracting information from the URI or formatting it.
I would now like to introduce several methods which support building and modifying URIs.
These methods form what's called a fluent interface which allows you to chain method calls.
Allow me to demonstrate.
Set Methods
Let's say we want to build a URL from scratch and we really don't want to mess with all those delimiters.
NOTE: UriBuilder is intended for building URIs. In truth, it's better than these extension methods if you are building a URI from scratch. These extension methods will be used more for easily modifying an existing Uri. But, you _could_ sort of build it from scratch as follows.
using Cambia;
// Build a URL using Set methods
// The Uri class requires that we start with a valid URI so we use the simplest
// thing we can.
Uri u = new Uri("junk://junk");
u = u.SetScheme("http")
    .SetUser("steve")
    .SetHost("host")
    .SetPort(555)
    .SetPath("/folder/app/file.aspx")
    .SetQueryItem("a", "hello")
    .SetQueryItem("b", "world")
    .SetFragment("fraggle");
// Here's what we produced
Assert.AreEqual("http://steve@host:555/folder/app/file.aspx?a=hello&b=world#fraggle",
    u.ToUrlType(UrlType.AbsoluteUri));
// And guess what, most of the inputs can contain leading and trailing delimiters (or not)
// so that you really don't have to worry about them. You can use Uri properties as
// inputs without having to remember whether the Uri.Fragment has a hash or not, etc.
u = u.SetScheme("http:")
    .SetUser("steve:")
    .SetHost("@host:")
    .SetPort(555)
    .SetPath("folder/app/file.aspx/")
    .SetQueryItem("a", "hello")
    .SetQueryItem("b", "world")
    .SetFragment("#fraggle");
// The result is the same
Assert.AreEqual("http://steve@host:555/folder/app/file.aspx?a=hello&b=world#fraggle",
    u.ToUrlType(UrlType.AbsoluteUri));
Drop Methods
The Drop methods are similar to the Set methods, but instead of adding or modifying URI parts, they remove them. Only certain parts can be removed. Scheme and host, for example, are required elements of an absolute URI.
using Cambia;
// Drop examples
Uri u = new Uri("http://steve@host:555/path?a=hello#fragment");
u = u.DropUserInfo()
    .DropPort()
    .DropPath()
    .DropQuery();
// The resulting URI after dropping several parts
Assert.AreEqual("http://host/#fragment", u.ToUrlType(UrlType.AbsoluteUri));
Set and Drop Methods in Combination
Finally, Set and Drop methods can be used together just as we've demonstrated above to modify a URI.
using Cambia;
Uri u = new Uri("http://steve@host:555/path?a=hello");
u = u.DropUserInfo()
    .SetHost("cambiaresearch.com")
    .DropPort()
    .SetQueryItem("a", "goodbye") // modify existing query item
    .SetQueryItem("b", "world"); // append a new query item
Assert.AreEqual("http://cambiaresearch.com/path?a=goodbye&b=world",
    u.ToUrlType(UrlType.AbsoluteUri));
A Quick Word about Relative URIs
Support for relative URIs, those which have no scheme or authority, is minimal in the Uri class. Most of the extension methods described in this article won't work with relative URIs. If you create a Uri object from a relative URI, most of the normal operations and properties will throw exceptions.
Because of this, our extension methods are limited in the same ways as the Uri class itself. Many of these methods will throw exceptions when operating on relative URIs because they need to access parts or properties that the Uri class does not support for relative URIs.
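The following sketch shows the underlying Uri behavior with the standard class only. A common workaround, shown at the end, is to resolve the relative URI against a base first; the http://host base here is just a placeholder chosen for the example.

```csharp
using System;

var relative = new Uri("/blog/page/2", UriKind.Relative);
Console.WriteLine(relative.IsAbsoluteUri);  // False

// Component properties such as Host are not available on a relative Uri.
bool threw = false;
try
{
    _ = relative.Host;
}
catch (InvalidOperationException)
{
    threw = true;
}
Console.WriteLine(threw);                   // True

// Resolving against a base yields an absolute Uri that can be inspected.
var absolute = new Uri(new Uri("http://host"), relative);
Console.WriteLine(absolute.AbsoluteUri);    // http://host/blog/page/2
```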