Age | Commit message | Author |
|
For example, "[::1", where the square bracket at
the end is missing.
Handle truncated host element by checking the
state we end up in after the entire string is parsed.
Some states represent internal elements of a host name
or address, and so if we're still in those states and
run out of input characters, the input string was
cut off early.
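
As an illustration of the end-state check described above, here is a minimal sketch. It is not the library's code: the state names, `INCOMPLETE_STATES`, and `parse_host` are hypothetical, and the state machine is reduced to the few states needed to show the idea.

```python
from enum import Enum, auto


class HostState(Enum):
    START = auto()
    REG_NAME = auto()          # plain registered name, may end anywhere
    IP_LITERAL = auto()        # inside "[...]", still waiting for "]"
    AFTER_IP_LITERAL = auto()  # the closing "]" has been seen


# Hypothetical: states that are only valid in the middle of a host.
INCOMPLETE_STATES = {HostState.IP_LITERAL}


def parse_host(text):
    """Return the host without brackets, or None if it is malformed."""
    state = HostState.START
    host = []
    for ch in text:
        if state is HostState.START:
            if ch == "[":
                state = HostState.IP_LITERAL
            else:
                host.append(ch)
                state = HostState.REG_NAME
        elif state is HostState.REG_NAME:
            host.append(ch)
        elif state is HostState.IP_LITERAL:
            if ch == "]":
                state = HostState.AFTER_IP_LITERAL
            else:
                host.append(ch)
        else:
            # Simplification: nothing may follow "]" in this sketch.
            return None
    # Running out of input in an "internal" state means truncation.
    if state in INCOMPLETE_STATES:
        return None
    return "".join(host)
```

With this sketch, `parse_host("[::1")` reports failure, while `parse_host("[::1]")` yields `"::1"`.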
|
|
Don't include the square brackets in the parsed out
host string; they are only there to delimit the host
inside of an overall URI string.
|
|
* Add ValidateIpv6Address.
* Add ValidateIpv4Address (since an IPv6 address is
allowed to contain an IPv4 address for compatibility)
* Add ValidateOctet (used by ValidateIpv4Address).
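
A rough sketch of how an octet check can feed an IPv4 check. The function names mirror the ones listed above, but the bodies are assumptions rather than the actual implementation. The sketch does not reject leading zeros, and IPv6 validation is left out.

```python
def validate_octet(text):
    """True if text is one to three ASCII digits with a value of 0..255."""
    if not 1 <= len(text) <= 3:
        return False
    if not (text.isascii() and text.isdigit()):
        return False
    return int(text) <= 255


def validate_ipv4_address(text):
    """True if text is four valid octets separated by dots."""
    parts = text.split(".")
    return len(parts) == 4 and all(validate_octet(p) for p in parts)
```

An IPv6 validator could call `validate_ipv4_address` on the trailing part of an address such as `::ffff:1.2.3.4`.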
|
|
* Extract CanNavigatePathUpOneLevel from NormalizePath.
* Add comments to explain what's going on elsewhere
in NormalizePath.
|
|
|
|
|
|
* Extract methods to copy various elements of one URI from another.
* Push NormalizePath implementation into a private method.
* Simplify and consolidate checks for absolute paths.
* Extract methods out of individual steps of ParseFromString.
|
|
Add comments that link parts of the code back
to lines of the pseudocode in the RFC,
to make the code easier to understand.
|
|
The former algorithm was based on the pseudocode
from the RFC, which is hard to follow and is better
suited to a path held in a single string than to a
sequence of segments.
The new algorithm uses two flags:
* isAbsolute - recognize that if the path starts out
as an absolute path, it needs to stay that way.
* atDirectoryLevel - recognize that if we encounter
a "." or "..", then it will be reduced by simply
discarding it or going back/up one step, but then
we will be in a "directory" context, meaning that
should we end the path at this point, there needs
to be an empty-string segment to mark that the
end of the path is reaching into a directory, not
just referring to the directory.
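
The two flags might work together roughly as follows. This is a sketch under stated assumptions, not the committed code: it assumes a path is a list of segments in which a leading empty string marks an absolute path and a trailing empty string marks a directory, and `normalize_path` is a hypothetical name.

```python
def normalize_path(segments):
    """Remove "." and ".." from a path given as a list of segments."""
    if not segments:
        return []
    # If the path starts out absolute, it needs to stay that way.
    is_absolute = segments[0] == ""
    rest = segments[1:] if is_absolute else segments
    out = []
    at_directory_level = False
    for segment in rest:
        if segment == ".":
            # Discard it; we are now "inside" the current directory.
            at_directory_level = True
        elif segment == "..":
            # Go up one level if possible, otherwise just discard it.
            if out:
                out.pop()
            at_directory_level = True
        else:
            out.append(segment)
            at_directory_level = False
    # Ending on "." or ".." means the path reaches into a directory,
    # so mark that with an empty-string segment.
    if at_directory_level and out:
        out.append("")
    return ([""] if is_absolute else []) + out
```

For example, the segments of `/a/b/c/./../../g` reduce to `["", "a", "g"]`, and `a/b/..` reduces to `["a", ""]`, that is, `a/`.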
|
|
Path normalization is hideously broken for now.
|
|
|
|
Such a URI should be considered equivalent to a path of "/"
because in both cases the path is an absolute path.
|
|
For normalization "step 2C", if the output path was
empty, we don't want to pop the end of it off.
|
|
* Code the neat example in section 6.2.2 of the RFC.
* Add equality/inequality operators for Uri.
|
|
|
|
Extract methods that parse the query and fragment.
|
|
* Replaced the more formal "state machine" used in URI
elements that may have percent-encoded characters, with
a simpler loop with a flag and a few conditional logic
paths.
* Extracted the parsing of the above types of elements into
a common method, DecodeElement.
* Kept DecodeQueryOrFragment around, in order to prevent
having to repeat the name of the allowed character set which
is common between query and fragment; however the function
is now just a very thin wrapper.
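
To illustrate the "simple loop with a flag" approach, here is a sketch. The names echo DecodeElement and DecodeQueryOrFragment, but the bodies and the character-set constants are assumptions. A small counter stands in for the flag, and the allowed set is assumed to hold only ASCII characters.

```python
import string

HEX_DIGITS = frozenset(string.hexdigits)
UNRESERVED = frozenset(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = frozenset("!$&'()*+,;=")
# RFC 3986: query and fragment both allow pchar plus "/" and "?".
QUERY_OR_FRAGMENT = UNRESERVED | SUB_DELIMS | frozenset(":@/?")


def decode_element(element, allowed):
    """Decode %XX triplets; return bytes, or None on any bad input."""
    decoded = bytearray()
    hex_digits_pending = 0  # non-zero while inside a %XX triplet
    value = 0
    for ch in element:
        if hex_digits_pending:
            if ch not in HEX_DIGITS:
                return None
            value = value * 16 + int(ch, 16)
            hex_digits_pending -= 1
            if hex_digits_pending == 0:
                decoded.append(value)
        elif ch == "%":
            hex_digits_pending = 2
            value = 0
        elif ch in allowed:
            decoded.append(ord(ch))
        else:
            return None
    # A triplet cut short at the end of the input is an error.
    if hex_digits_pending:
        return None
    return bytes(decoded)


def decode_query_or_fragment(element):
    """Thin wrapper that only supplies the shared character set."""
    return decode_element(element, QUERY_OR_FRAGMENT)
```

The wrapper exists only so that the query and fragment callers do not each have to name the shared character set.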
|
|
* Remove IsCharacterInSet function
|
|
|
|
|
|
|
|
Added CharacterSet as a class to represent character sets,
allowing us to build singletons and composite character sets
more concisely.
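
One way such a class could look, as a sketch only. The constructor conventions here (strings for individual characters, two-element tuples for inclusive ranges, other sets for composition) are assumptions made for illustration.

```python
class CharacterSet:
    """An immutable set of characters built from simpler pieces."""

    def __init__(self, *parts):
        chars = set()
        for part in parts:
            if isinstance(part, CharacterSet):
                # Composite: fold in another character set.
                chars |= part._chars
            elif isinstance(part, tuple):
                # Inclusive range, such as ("a", "z").
                first, last = part
                chars.update(chr(c) for c in range(ord(first), ord(last) + 1))
            else:
                # A string contributes each of its characters.
                chars.update(part)
        self._chars = frozenset(chars)

    def __contains__(self, ch):
        return ch in self._chars


ALPHA = CharacterSet(("a", "z"), ("A", "Z"))
DIGIT = CharacterSet(("0", "9"))
UNRESERVED = CharacterSet(ALPHA, DIGIT, "-._~")
```

With this shape, a membership test reads as `"q" in UNRESERVED`, and composite sets are declared in one line each.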
|
|
* Extract IsCharacterInSet to its own module.
* Extract PercentEncodedCharacterDecoder to its own module.
|
|
Remove state 3 hole in host/port parsing state machine
|
|
Extract percent-encoded character decoding, so that
the logic is all in one class that is reused.
|
|
|
|
|
|
|
|
Path may also have colon, so make sure we don't scan
into the path element if there is one.
|
|
* Detect bad characters in host names.
* Incorporate splitting host and port into the state
machine that is parsing/decoding the host.
NOTE:
IPv6address is not checked for bad characters yet.
More research is needed to learn exactly what
the various ways to write an IPv6 address are.
|
|
A colon may be in the authority, if present, so limit
the search for scheme delimiter so we aren't scanning
the authority part, when parsing the scheme.
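
A sketch of the idea, assuming that the first "/" marks where the authority or path begins. `split_scheme` is a hypothetical helper, not a function from the library.

```python
def split_scheme(uri):
    """Return (scheme, rest); scheme is "" when the URI has none."""
    # Only look for ":" before the first "/", so that a colon in the
    # authority (for example, before a port) is never mistaken for
    # the scheme delimiter.
    limit = uri.find("/")
    if limit == -1:
        limit = len(uri)
    colon = uri.find(":", 0, limit)
    if colon == -1:
        return "", uri
    return uri[:colon], uri[colon + 1:]
```

For instance, `split_scheme("//example.com:8080/")` leaves the whole string as the remainder instead of treating `//example.com` as a scheme.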
|
|
|
|
Extracted IsCharacterInSet function
|
|
|
|
Extract method ParseAuthority
|
|
Extract method that parses the path segments from
the whole path string.
|
|
* Extract function that parses 16-bit unsigned integers,
to use in parsing port element.
* Clean up and clarify what parts of the original URI
string are still being held onto at various points
in the code.
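
A minimal sketch of parsing a port as a 16-bit unsigned integer. The name `parse_uint16` and the use of `None` to signal failure are assumptions made for illustration.

```python
def parse_uint16(text):
    """Parse decimal digits into 0..65535, or return None on failure."""
    if not text:
        return None
    value = 0
    for ch in text:
        if ch not in "0123456789":
            return None
        value = value * 10 + int(ch)
        # Check on every digit so oversized input fails early.
        if value > 0xFFFF:
            return None
    return value
```

Here `parse_uint16("8080")` gives `8080`, while `"65536"`, `"80a"`, and the empty string are all rejected.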
|
|
|
|
* Add IsRelativeReference.
* Add IsRelativePath.
* Add Query.
* Add Fragment.
* Add UserInfo.
* Fix parsing of URIs that have no scheme.
|
|
|
|
* Parts of a path are called "segments", not "steps",
in the RFC.
* The RFC specifies that path separators are always
forward slashes, so don't support other separators.
|
|
* Can now parse URIs from strings.
* This supports scheme, host, and path.
* Path separator defaults to "/" but may be customized.
|
|
|