silverorange Code


Swat.SwatString
/Swat/SwatString.php at line 16

Class SwatString

SwatObject
└─SwatString

public class SwatString
extends SwatObject

String Tools

Copyright:
2005-2007 silverorange
License:
http://www.gnu.org/copyleft/lesser.html LGPL License 2.1

Field Summary
static array

$blocklevel_elements

Block level XHTML elements used when filtering strings.

static array

$breaking_elements

These XHTML elements are not block-level but people often write markup treating these elements as block-level tags.

static array

$preformatted_elements

XHTML elements where the content is pre-formatted and should not be modified.

static array

$table_elements

These XHTML elements are used for tables.

static array

$xhtml_elements

All XHTML elements.

Constructor Summary

SwatString()

Don't allow instantiation of the SwatString object.

Method Summary
static string

byteFormat(integer value, integer magnitude, boolean iec_units, integer significant_digits)

Format bytes in human-readable units.

static string

condense(string text, integer max_length)

Takes a block of text and condenses it into a small fragment of XHTML.

static string

condenseToName(string string, integer max_length)

Condenses a string to a name.

static string

ellipsizeMiddle(string string, integer max_length, string ellipses, mixed flag, boolean &$flag)

Ellipsizes a string in the middle.

static string

ellipsizeRight(string string, integer max_length, string ellipses, mixed flag, boolean &$flag)

Ellipsizes a string to the right.

private static void

getDecimalPrecision(mixed value)

static string

getInternationalCurrencySymbol(string locale)

Gets the international currency symbol of a locale.

static string

getSalt(integer length)

Gets a salt value of the specified length.

static string

hash(string string)

Gets a unique hash of a string.

private static void

insertEntities(string string, array matches, integer hole_start, integer hole_end, integer hole_length)

Re-inserts stripped entities into a string in the correct positions.

static string

linkify(string string)

Replaces all URIs in a string with anchor markup tags.

static string

minimizeEntities(string text)

Converts a UTF-8 text string to have the minimal number of entities necessary to output it as valid UTF-8 XHTML without ever double-escaping.

static string

minimizeEntitiesWithTags(string text, array tags)

Same as SwatString::minimizeEntities() but also accepts a list of tags to preserve.

static string

moneyFormat(float value, string locale, boolean display_currency, integer decimal_places)

Formats a numeric value as currency.

static string

numberFormat(float value, integer decimals, string locale, boolean show_thousands_separator)

Formats a number using locale-based separators.

static void

ordinalNumberFormat(integer value)

Formats an integer as an ordinal number (1st, 2nd, 3rd).

static string

pad(string input, int pad_length, string pad_string, int pad_type)

Pads a string in a UTF-8 safe way.

private static void

parseNegativeNotation(mixed string)

static string

quoteJavaScriptString(string string)

Safely quotes a PHP string into a JavaScript string.

static string

removeLeadingPunctuation(string string)

Removes leading punctuation from a string.

static string

removePunctuation(string string)

Removes both leading and trailing punctuation from a string.

static string

removeTrailingPunctuation(string string)

Removes trailing punctuation from a string.

static string

signedSerialize(mixed data, string salt)

Serializes and signs a value using a salt.

static mixed

signedUnserialize(string data, string salt)

Unserializes a signed serialized value.

private static void

stripEntities(string string, array matches)

Strips entities from a string remembering their positions.

static string

stripXHTMLTags(string string)

Removes all XHTML tags from a string.

static float

toFloat(string string)

Convert a locale-formatted number and return it as a float.

static integer

toInteger(string string)

Convert a locale-formatted number and return it as an integer.

static string

toList(array|Iterator iterator, string conjunction, string delimiter, boolean display_final_delimiter)

Convert an iterable object or array into a human-readable, delimited list.

static string

toXHTML(string text)

Intelligently converts a text block to XHTML.

static boolean

validateUtf8(string string)

Checks whether or not a string is valid UTF-8.

Field Detail

/Swat/SwatString.php at line 25

blocklevel_elements

public static array $blocklevel_elements = array(...)

Block level XHTML elements used when filtering strings


/Swat/SwatString.php at line 37

breaking_elements

public static array $breaking_elements = array( 'li', 'dd', 'dt', )

These XHTML elements are not block-level but people often write markup treating these elements as block-level tags


/Swat/SwatString.php at line 82

preformatted_elements

public static array $preformatted_elements = array(...)

XHTML elements where the content is pre-formatted and should not be modified


/Swat/SwatString.php at line 46

table_elements

public static array $table_elements = array(...)

These XHTML elements are used for tables


/Swat/SwatString.php at line 58

xhtml_elements

public static array $xhtml_elements = array(...)

All XHTML elements

Taken from http://www.w3.org/TR/html4/index/elements.html.


Constructor Detail

/Swat/SwatString.php at line 1740

SwatString

public SwatString()

Don't allow instantiation of the SwatString object

This class contains only static methods and should not be instantiated.


Method Detail

/Swat/SwatString.php at line 1001

byteFormat

public static string byteFormat(integer value, integer magnitude, boolean iec_units, integer significant_digits)

Format bytes in human-readable units

By default, bytes are formatted using canonical, ambiguous, base-10 prefixed units. Bytes may optionally be formatted using unambiguous IEC standard binary prefixes. See the National Institute of Standards and Technology's page on binary unit prefixes at http://physics.nist.gov/cuu/Units/binary.html for details.

Parameters:
value - the value in bytes to format.
magnitude - optional. The power of 2 to use as the unit base. This value will be rounded to the nearest ten if specified. If less than zero or not specified, the highest power less than $value will be used.
iec_units - optional. Whether or not to use IEC binary multiple prefixed units (Mebibyte). Defaults to using canonical units.
significant_digits - optional. The number of significant digits in the formatted result. If null, the value will be rounded and formatted with one fractional digit. Otherwise, the value is rounded to the specified number of digits. By default, this is three. If there are more integer digits than the specified number of significant digits, the value is rounded to the nearest integer.
Returns:
the byte value formatted according to IEC units.
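
The difference between the two unit systems can be illustrated with a minimal sketch. This is not SwatString's implementation: the function name sketchByteFormat, the fixed one-fractional-digit rounding, and the unit labels are assumptions made for this example. Both modes divide by powers of 1024, following the "power of 2" wording of the magnitude parameter.

```php
<?php
// Hypothetical helper, not part of Swat: formats a byte count using powers
// of 1024 and either IEC labels (KiB, MiB) or conventional labels (KB, MB).
function sketchByteFormat(int $bytes, bool $iecUnits = false): string
{
    $labels = $iecUnits
        ? ['bytes', 'KiB', 'MiB', 'GiB', 'TiB']
        : ['bytes', 'KB', 'MB', 'GB', 'TB'];

    $value = (float) $bytes;
    $index = 0;

    // Step up one unit at a time while the value is at least 1024.
    while ($value >= 1024 && $index < count($labels) - 1) {
        $value /= 1024;
        $index++;
    }

    // Whole bytes need no fractional digit; larger units get one.
    $digits = ($index === 0) ? 0 : 1;

    return number_format($value, $digits, '.', '') . ' ' . $labels[$index];
}

echo sketchByteFormat(1536, true), "\n"; // 1.5 KiB
echo sketchByteFormat(1048576), "\n";    // 1.0 MB
```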

/Swat/SwatString.php at line 359

condense

public static string condense(string text, integer max_length)

Takes a block of text and condenses it into a small fragment of XHTML.

Condensing text removes inline XHTML tags and replaces line breaks and block-level elements with special characters.

Parameters:
text - the text to be condensed.
max_length - the maximum length of the condensed text. If null is specified, there is no maximum length.
Returns:
the condensed text. The condensed text is an XHTML formatted string.

/Swat/SwatString.php at line 437

condenseToName

public static string condenseToName(string string, integer max_length)

Condenses a string to a name

The generated name can be used for things like database identifiers and site URI fragments.

Example:

<?php
$string = 'The quick brown fox jumped over the lazy dogs.';
// displays 'thequickbrown'
echo SwatString::condenseToName($string);
?>

Parameters:
string - the string to condense to a name.
max_length - the maximum length of the condensed name in characters.
Returns:
the string condensed into a name.

/Swat/SwatString.php at line 599

ellipsizeMiddle

public static string ellipsizeMiddle(string string, integer max_length, string ellipses, mixed flag, boolean &$flag)

Ellipsizes a string in the middle

The length of a string is calculated as the number of visible characters. This method will properly account for any XHTML entities that may be present in the given string.

Example:

<?php
$string = 'The quick brown fox jumped over the lazy dogs.';
// displays 'The quick ... dogs.'
echo SwatString::ellipsizeMiddle($string, 18, ' ... ');
?>

XHTML example:

<?php
$string = 'The &#8220;quick&#8221; brown fox jumped over the lazy dogs.';
// displays 'The &#8220;quick&#8221; ... dogs.'
echo SwatString::ellipsizeMiddle($string, 18, ' ... ');
?>

Parameters:
string - the string to ellipsize.
max_length - the maximum length of the returned string. This length does not account for any ellipsis characters that may be inserted.
ellipses - the ellipsis characters to insert if the string is shortened. By default, this is a Unicode ellipsis character padded by non-breaking spaces.
&$flag - an optional boolean flag passed by reference to the ellipsize function. If the given string is ellipsized, the flag is set to true. If no ellipsizing takes place, the flag is set to false.
Returns:
the ellipsized string. The ellipsized string may include ellipses characters in roughly the middle if it was longer than $max_length.

/Swat/SwatString.php at line 520

ellipsizeRight

public static string ellipsizeRight(string string, integer max_length, string ellipses, mixed flag, boolean &$flag)

Ellipsizes a string to the right

The length of a string is calculated as the number of visible characters. This method will properly account for any XHTML entities that may be present in the given string.

Example:

<?php
$string = 'The quick brown fox jumped over the lazy dogs.';
// displays 'The quick brown ...'
echo SwatString::ellipsizeRight($string, 18, ' ...');
?>

XHTML example:

<?php
$string = 'The &#8220;quick&#8221; brown fox jumped over the lazy dogs.';
// displays 'The &#8220;quick&#8221; brown ...'
echo SwatString::ellipsizeRight($string, 18, ' ...');
?>

Parameters:
string - the string to ellipsize.
max_length - the maximum length of the returned string. This length does not account for any ellipsis characters that may be appended. If the returned value must be below a certain number of characters, pass a blank string in the ellipses parameter.
ellipses - the ellipsis characters to append if the string is shortened. By default, this is a non-breaking space followed by a Unicode ellipsis character.
&$flag - an optional boolean flag passed by reference to the ellipsize function. If the given string is ellipsized, the flag is set to true. If no ellipsizing takes place, the flag is set to false.
Returns:
the ellipsized string. The ellipsized string may be appended with ellipses characters if it was longer than $max_length.

/Swat/SwatString.php at line 1722

getDecimalPrecision

private static void getDecimalPrecision(mixed value)

/Swat/SwatString.php at line 824

getInternationalCurrencySymbol

public static string getInternationalCurrencySymbol(string locale)

Gets the international currency symbol of a locale

Parameters:
locale - optional. Locale to get the international currency symbol for. If no locale is specified, the current locale is used.
Returns:
the international currency symbol for the specified locale. The symbol is UTF-8 encoded and does not include the spacing character specified in the C99 standard.
Throws:
if the given locale could not be set.

/Swat/SwatString.php at line 1328

getSalt

public static string getSalt(integer length)

Gets a salt value of the specified length

Useful for securing passwords or other one-way encrypted fields that may be susceptible to a dictionary attack.

This method generates a random ASCII string of the specified length. All ASCII characters except the null character (0x00) may be included in the returned string.

Parameters:
length - the desired length of the salt.
Returns:
a salt value of the specified length.
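
The description above (random ASCII characters, never the null character) can be sketched as follows. The function name sketchSalt and the use of random_int() are assumptions made for this example; the source of randomness Swat actually uses is not stated here.

```php
<?php
// Hypothetical helper, not part of Swat: builds a string of $length random
// ASCII characters, each with a code between 1 and 127 (never 0x00).
function sketchSalt(int $length): string
{
    $salt = '';
    for ($i = 0; $i < $length; $i++) {
        // random_int() draws from a cryptographically secure source.
        $salt .= chr(random_int(1, 127));
    }
    return $salt;
}

$salt = sketchSalt(16);
echo strlen($salt), "\n"; // 16
```

Because the result may contain control characters, it is only suitable for fields that accept arbitrary bytes.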

/Swat/SwatString.php at line 1296

hash

public static string hash(string string)

Gets a unique hash of a string

The hashing is as unique as md5 but the hash string is shorter than md5. This method is useful if hash strings will be visible to end-users and shorter hash strings are desired.

Parameters:
string - the string to get the unique hash for.
Returns:
the unique hash of the given string. The returned string is safe to use inside a URI.
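
One plausible way to obtain a hash that is as unique as md5 but shorter and URI-safe is to re-encode the raw MD5 bytes. The sketch below is an assumption about the general technique, not a statement of how SwatString::hash() is written; the function name sketchShortHash is hypothetical.

```php
<?php
// Hypothetical helper, not part of Swat: encodes the 16 raw MD5 bytes as
// URL-safe base64, giving 22 characters instead of 32 hexadecimal digits.
function sketchShortHash(string $string): string
{
    $raw = md5($string, true);      // 16 raw bytes
    $encoded = base64_encode($raw); // 24 characters, ending in "=="
    // Swap the two characters that are unsafe in URIs and drop the padding.
    return rtrim(strtr($encoded, '+/', '-_'), '=');
}

echo strlen(sketchShortHash('hello')), "\n"; // 22
```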

/Swat/SwatString.php at line 1643

insertEntities

private static void insertEntities(string string, array matches, integer hole_start, integer hole_end, integer hole_length)

Re-inserts stripped entities into a string in the correct positions

The first two parameters are passed by reference and nothing is returned by this function.

Parameters:
string - the string to re-insert entities into.
matches - the array of stored matches.
hole_start - ignore inserting entities between here and hole_end.
hole_end - ignore inserting entities between here and hole_start.
hole_length - the length of the new contents of the hole.

/Swat/SwatString.php at line 1373

linkify

public static string linkify(string string)

Replaces all URIs in a string with anchor markup tags

This method does not know if a URI is already inside markup so it is best to only use it on plain text.

Only "http" and "https" URIs are currently supported.

Parameters:
string - the string to replace URIs in.
Returns:
the given string with all URIs wrapped in anchor tags.
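
The behaviour described above can be sketched with a single regular expression. The pattern and the function name sketchLinkify are assumptions made for this example and are deliberately simpler than a production URI matcher; for instance, punctuation directly after a URI would be included in the link.

```php
<?php
// Hypothetical helper, not part of Swat: wraps each http or https URI found
// in plain text in an anchor tag.
function sketchLinkify(string $text): string
{
    // A URI here is "http://" or "https://" followed by any run of
    // characters other than whitespace, quotes, and angle brackets.
    $pattern = '~https?://[^\s<>"\']+~i';

    return preg_replace_callback($pattern, function (array $match): string {
        $uri = htmlspecialchars($match[0], ENT_QUOTES, 'UTF-8');
        return '<a href="' . $uri . '">' . $uri . '</a>';
    }, $text);
}

echo sketchLinkify('See http://example.com/docs for details'), "\n";
// See <a href="http://example.com/docs">http://example.com/docs</a> for details
```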

/Swat/SwatString.php at line 295

minimizeEntities

public static string minimizeEntities(string text)

Converts a UTF-8 text string to have the minimal number of entities necessary to output it as valid UTF-8 XHTML without ever double-escaping.

The text is converted as follows:

Parameters:
text - the UTF-8 text string to convert.
Returns:
the UTF-8 text string with minimal entities.

/Swat/SwatString.php at line 319

minimizeEntitiesWithTags

public static string minimizeEntitiesWithTags(string text, array tags)

Same as SwatString::minimizeEntities() but also accepts a list of tags to preserve.

Parameters:
text - the UTF-8 text string to convert.
tags - names of tags that should be preserved.
Returns:
the UTF-8 text string with minimal entities.

/Swat/SwatString.php at line 762

moneyFormat

public static string moneyFormat(float value, string locale, boolean display_currency, integer decimal_places)

Formats a numeric value as currency

Note: This method does not work in some operating systems and in such cases, this method will throw an exception.

Note: This method is deprecated. Use SwatI18NLocale::formatCurrency() instead. The newer method is more flexible and works across more platforms.

Parameters:
value - the numeric value to format.
locale - optional locale to use to format the value. If no locale is specified, the current locale is used.
display_currency - optional flag specifying whether or not the international currency symbol is appended to the output. If not specified, the international currency symbol is omitted from the output.
decimal_places - optional number of decimal places to display. If not specified, the locale's default number of decimal places is used.
Returns:
a UTF-8 encoded string containing the formatted currency value.
Throws:
if the PHP money_format() function is undefined.
if the given locale could not be set.
if the locale-based output cannot be converted to UTF-8.
Deprecated:
Use SwatI18NLocale::formatCurrency() instead. It is more flexible and works across more platforms.

/Swat/SwatString.php at line 880

numberFormat

public static string numberFormat(float value, integer decimals, string locale, boolean show_thousands_separator)

Formats a number using locale-based separators

Parameters:
value - the numeric value to format.
decimals - number of decimal places to display. By default, the full number of decimal places of the value will be displayed.
locale - an optional locale to use to format the value. If no locale is specified, the current locale is used.
show_thousands_separator - whether or not to display the thousands separator (default is true).
Returns:
a UTF-8 encoded string containing the formatted number.
Throws:
if the given locale could not be set.
if the locale-based output cannot be converted to UTF-8.

/Swat/SwatString.php at line 933

ordinalNumberFormat

public static void ordinalNumberFormat(integer value)

Formats an integer as an ordinal number (1st, 2nd, 3rd)

This method uses English suffixes and is not translatable.

Parameters:
value - the numeric value to format.
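
The English suffix rule can be sketched as follows. The function name sketchOrdinal is hypothetical and the code is an illustration of the rule, not Swat's implementation.

```php
<?php
// Hypothetical helper, not part of Swat: appends the English ordinal suffix.
function sketchOrdinal(int $value): string
{
    $lastTwo = abs($value) % 100;
    $last = abs($value) % 10;

    // 11, 12 and 13 take "th" even though they end in 1, 2 and 3.
    if ($lastTwo >= 11 && $lastTwo <= 13) {
        $suffix = 'th';
    } elseif ($last === 1) {
        $suffix = 'st';
    } elseif ($last === 2) {
        $suffix = 'nd';
    } elseif ($last === 3) {
        $suffix = 'rd';
    } else {
        $suffix = 'th';
    }

    return $value . $suffix;
}

echo sketchOrdinal(1), ' ', sketchOrdinal(22), ' ', sketchOrdinal(113), "\n";
// 1st 22nd 113th
```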

/Swat/SwatString.php at line 1090

pad

public static string pad(string input, int pad_length, string pad_string, int pad_type)

Pads a string in a UTF-8 safe way.

Parameters:
input - the string to pad.
pad_length - length in characters to pad to.
pad_string - string to use for padding.
pad_type - type of padding to use: STR_PAD_LEFT, STR_PAD_RIGHT, or STR_PAD_BOTH.
Returns:
the padded string.
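
To show why a UTF-8 safe version is needed, the sketch below pads on the right by character count and compares the result with PHP's byte-based str_pad(). It assumes the mbstring extension is available; the function name sketchPadRight is hypothetical and only right padding is covered.

```php
<?php
// Hypothetical helper, not part of Swat: right-pads a UTF-8 string to a
// length measured in characters rather than bytes (requires mbstring).
function sketchPadRight(string $input, int $padLength, string $padString = ' '): string
{
    $missing = $padLength - mb_strlen($input, 'UTF-8');
    if ($missing <= 0 || $padString === '') {
        return $input;
    }

    // Repeat the pad string enough times, then trim it to the exact length.
    $repeats = (int) ceil($missing / mb_strlen($padString, 'UTF-8'));
    $padding = mb_substr(str_repeat($padString, $repeats), 0, $missing, 'UTF-8');

    return $input . $padding;
}

echo sketchPadRight('café', 6, '*'), "\n"; // café**
echo str_pad('café', 6, '*'), "\n";        // café* (str_pad counts 5 bytes)
```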

/Swat/SwatString.php at line 1689

parseNegativeNotation

private static void parseNegativeNotation(mixed string)

/Swat/SwatString.php at line 1456

quoteJavaScriptString

public static string quoteJavaScriptString(string string)

Safely quotes a PHP string into a JavaScript string

Strings are always quoted using single quotes. The characters documented at http://code.google.com/p/doctype/wiki/ArticleXSSInJavaScript are escaped to prevent XSS attacks.

Parameters:
string - the PHP string to quote as a JavaScript string.
Returns:
the quoted JavaScript string. The quoted string is wrapped in single quotation marks and is safe to display in inline JavaScript.

/Swat/SwatString.php at line 704

removeLeadingPunctuation

public static string removeLeadingPunctuation(string string)

Removes leading punctuation from a string

Parameters:
string - the string to remove punctuation from.
Returns:
the string with leading punctuation removed.

/Swat/SwatString.php at line 719

removePunctuation

public static string removePunctuation(string string)

Removes both leading and trailing punctuation from a string

Parameters:
string - the string to remove punctuation from.
Returns:
the string with leading and trailing punctuation removed.

/Swat/SwatString.php at line 689

removeTrailingPunctuation

public static string removeTrailingPunctuation(string string)

Removes trailing punctuation from a string

Parameters:
string - the string to remove punctuation from.
Returns:
the string with trailing punctuation removed.

/Swat/SwatString.php at line 1397

signedSerialize

public static string signedSerialize(mixed data, string salt)

Serializes and signs a value using a salt

By signing serialized data, it is possible to detect tampering of serialized data. This is useful if serialized data is accepted from user editable $_GET, $_POST or $_COOKIE data.

Parameters:
data - the data to serialize.
salt - the signature salt.
Returns:
the signed serialized value.
See Also:
SwatString::signedUnserialize()

/Swat/SwatString.php at line 1422

signedUnserialize

public static mixed signedUnserialize(string data, string salt)

Unserializes a signed serialized value

Parameters:
data - the signed serialized data.
salt - the signature salt. This must be the same salt value used to serialize the value.
Returns:
the unserialized value.
Throws:
if the signed serialized data has been tampered with.
See Also:
SwatString::signedSerialize()
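
The sign-then-verify round trip can be sketched as follows. The signature algorithm (HMAC-SHA256), the "signature|data" layout, the exception type, and the function names are all assumptions made for this example; Swat's actual format may differ.

```php
<?php
// Hypothetical helpers, not part of Swat: prefix serialized data with an
// HMAC-SHA256 signature and verify that signature before unserializing.
function sketchSignedSerialize($data, string $salt): string
{
    $serialized = serialize($data);
    return hash_hmac('sha256', $serialized, $salt) . '|' . $serialized;
}

function sketchSignedUnserialize(string $signed, string $salt)
{
    $parts = explode('|', $signed, 2);
    if (count($parts) !== 2) {
        throw new RuntimeException('Malformed signed data.');
    }
    [$signature, $serialized] = $parts;

    // hash_equals() compares in constant time to avoid timing leaks.
    if (!hash_equals(hash_hmac('sha256', $serialized, $salt), $signature)) {
        throw new RuntimeException('Signed data has been tampered with.');
    }

    return unserialize($serialized);
}

$signed = sketchSignedSerialize(['id' => 42], 'secret-salt');
var_dump(sketchSignedUnserialize($signed, 'secret-salt') === ['id' => 42]); // bool(true)
```

Verifying the signature before calling unserialize() matters because unserializing untrusted input is itself unsafe.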

/Swat/SwatString.php at line 1617

stripEntities

private static void stripEntities(string string, array matches)

Strips entities from a string remembering their positions

Stripped entities are replaced with a single special character. All parameters are passed by reference and nothing is returned by this function.

Parameters:
string - the string to strip entities from.
matches - the array to store matches in.

/Swat/SwatString.php at line 1352

stripXHTMLTags

public static string stripXHTMLTags(string string)

Removes all XHTML tags from a string

This method is similar to the built-in strip_tags function in PHP but this method only strips XHTML tags. All other tags are left intact.

Parameters:
string - the string to remove XHTML tags from.
Returns:
the given string with all XHTML tags removed.
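
The contrast with strip_tags can be sketched by removing only tags whose names appear in a known list. The function name sketchStripTags is hypothetical and the element list is passed in by the caller here, in the spirit of the $xhtml_elements field; the pattern is an assumption, not Swat's own.

```php
<?php
// Hypothetical helper, not part of Swat: removes opening and closing tags
// for the listed element names and leaves every other tag untouched.
function sketchStripTags(string $string, array $elements): string
{
    $names = implode('|', array_map('preg_quote', $elements));
    // \b stops "p" from also matching longer names such as "pre".
    $pattern = '~</?(?:' . $names . ')\b[^>]*>~i';

    return preg_replace($pattern, '', $string);
}

echo sketchStripTags('<p>Hi <foo>there</foo></p>', ['p', 'em']), "\n";
// Hi <foo>there</foo>
```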

/Swat/SwatString.php at line 1199

toFloat

public static float toFloat(string string)

Convert a locale-formatted number and return it as a float.

If the string is not a float, the method returns null.

Parameters:
string - the string to convert.
Returns:
The converted value.

/Swat/SwatString.php at line 1150

toInteger

public static integer toInteger(string string)

Convert a locale-formatted number and return it as an integer.

If the string cannot be converted to an integer, the method returns null. If the number has values after the decimal point, the value is rounded according to the rounding rules for PHP's intval function.

If the number is too large to fit in PHP's integer range (depends on system architecture), an exception is thrown.

Parameters:
string - the string to convert.
Returns:
the converted value or null if it could not be converted.
Throws:
if the converted number is too large to fit in an integer.

/Swat/SwatString.php at line 1246

toList

public static string toList(array|Iterator iterator, string conjunction, string delimiter, boolean display_final_delimiter)

Convert an iterable object or array into a human-readable, delimited list.

Parameters:
iterator - the object to convert to a list.
conjunction - the list's conjunction. Usually 'and' or 'or'.
delimiter - the list delimiter. If list items should additionally be padded with a space, the delimiter should also include the space.
display_final_delimiter - whether or not the final list item should be separated from the list with a delimiter.
Returns:
The formatted list.
Throws:
if the iterator value is not an array or Iterator.
Todo:
Think about using a mask to make this as flexible as possible for different locales.
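
How the conjunction, delimiter, and final-delimiter flag interact can be sketched as follows. The function name sketchToList, the default values, and the handling of one- and two-item lists are assumptions made for this example; only arrays are handled.

```php
<?php
// Hypothetical helper, not part of Swat: joins items with a delimiter and
// puts a conjunction before the last item.
function sketchToList(
    array $items,
    string $conjunction = 'and',
    string $delimiter = ', ',
    bool $finalDelimiter = true
): string {
    $items = array_values($items);
    $count = count($items);

    if ($count === 0) {
        return '';
    }
    if ($count === 1) {
        return (string) $items[0];
    }
    if ($count === 2) {
        return $items[0] . ' ' . $conjunction . ' ' . $items[1];
    }

    $last = array_pop($items);
    $list = implode($delimiter, $items);
    // With the final delimiter: "a, b, and c"; without it: "a, b and c".
    $list .= $finalDelimiter ? $delimiter : ' ';

    return $list . $conjunction . ' ' . $last;
}

echo sketchToList(['apples', 'pears', 'plums']), "\n";
// apples, pears, and plums
```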

/Swat/SwatString.php at line 103

toXHTML

public static string toXHTML(string text)

Intelligently converts a text block to XHTML

The text is converted as follows:

Parameters:
text - the text block to convert to XHTML.
Returns:
the text block converted to XHTML.

/Swat/SwatString.php at line 1534

validateUtf8

public static boolean validateUtf8(string string)

Checks whether or not a string is valid UTF-8

UTF-8 validation is adapted from Toby Inkster's PHP UTF-8 Validation Library found at http://tobyinkster.co.uk/blog/2007/03/21/utf8-validation/. Like Swat, Toby's library is available under the LGPL v2.1 or later.

Unlike Toby's validation function, this method does not modify the original string.

Limitations:

Note that in UTF-8, most characters have several alternative representations. RFC 3629 says that the shortest representation is the correct one. Other representations ("overlong forms") are not valid. Earlier UTF-8 specifications did not prohibit overlong forms, though they suggested emitting a warning when one is encountered. This function does not check for overlong forms!

Parameters:
string - the string to check.
Returns:
true if the string is valid UTF-8 and false if it is not.
Author:
Michael Gauthier
Toby Inkster
Copyright:
2007 silverorange
2007 Toby Inkster
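
For comparison, a validity check can also be sketched with PHP built-ins. This is an illustration that does not use Swat; the function name sketchIsValidUtf8 is hypothetical. Unlike the method documented above, the PCRE check shown here rejects overlong forms.

```php
<?php
// Illustration using PHP built-ins rather than Swat: preg_match() with the
// "u" modifier fails on subjects that are not valid UTF-8.
function sketchIsValidUtf8(string $string): bool
{
    return preg_match('//u', $string) === 1;
}

var_dump(sketchIsValidUtf8("caf\xC3\xA9")); // bool(true): C3 A9 encodes é
var_dump(sketchIsValidUtf8("caf\xE9"));     // bool(false): lone Latin-1 byte
var_dump(sketchIsValidUtf8("\xC0\xAF"));    // bool(false): overlong form of "/"
```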
