What types of encoding does PagePix support?
Posted by Brandon Elliott on 16 December 2012 11:41 PM
We support UTF-8 encoded URLs and Internationalized Domain Names (IDNs)*
Ex: http://homöopathie-altona.de (notice the German umlaut)
URL encoding is optional but strongly recommended (Ex: German umlaut ö as its UTF-8 bytes, %C3%B6)
PHP Example: urlencode('http://homöopathie-altona.de');
*We also support Punycode alternatives to an IDN, but they are not necessary.
Our service supports most umlauts and other non-English characters.
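For readers not using PHP, here is a rough Python equivalent of the example above. It is only a sketch: `quote_plus` is the standard-library counterpart to PHP's `urlencode`, and the `idna` codec is Python's built-in way to produce the optional Punycode form of a host name. How the encoded value is passed to the service is not shown, since it depends on your request format.

```python
from urllib.parse import quote_plus

url = "http://homöopathie-altona.de"

# Percent-encode the whole URL as UTF-8 so it can be sent as a parameter value.
encoded = quote_plus(url)
print(encoded)  # http%3A%2F%2Fhom%C3%B6opathie-altona.de

# Optional: the Punycode (ASCII-only) form of the host name.
ascii_host = "homöopathie-altona.de".encode("idna").decode("ascii")
print(ascii_host)
```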
Ex: å ä ö Å Ä Ö רי ל ו т в у й
Note: We do NOT support escaped encoding, meaning the single-byte, non-UTF-8 form produced by functions such as JavaScript's escape() (Ex: German umlaut ö as %F6)
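One way to catch the unsupported form before sending a request is to check that every percent-escape decodes as valid UTF-8. The helper below is a hypothetical client-side check, not part of our service; its name is an assumption made for illustration.

```python
from urllib.parse import unquote

def is_utf8_percent_encoded(value: str) -> bool:
    """Return True if all percent-escapes in value decode as valid UTF-8."""
    try:
        # errors="strict" makes unquote raise instead of substituting U+FFFD.
        unquote(value, encoding="utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False

print(is_utf8_percent_encoded("hom%C3%B6opathie-altona.de"))  # True  (UTF-8 bytes for ö)
print(is_utf8_percent_encoded("hom%F6opathie-altona.de"))     # False (single-byte %F6)
```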
Note: Over many years of running this service, we have seen that some customers store URLs that already contain encoded elements and THEN urlencode them again when sending. The best practice is to urldecode URLs before saving them, and then urlencode them once when requesting screenshots from our service. For maximum compatibility, however, we double decode all URLs. This makes the problem transparent and yields a good screenshot result. IF the URL is still encoded after a double decode, we return an INVALID_ENCODING error response.
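The best practice described above could be sketched on the client side as follows. This is an illustration under stated assumptions, not our server code: the function names are hypothetical, the "still encoded" test is a simple search for `%XX` sequences, and the error is raised locally as a `ValueError` that merely borrows the INVALID_ENCODING name.

```python
import re
from urllib.parse import quote, unquote

# Assumption: a URL counts as "still encoded" if it contains any %XX sequence.
_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")

def decode_for_storage(url: str, max_passes: int = 2) -> str:
    """Decode a URL up to max_passes times so it is stored unencoded."""
    for _ in range(max_passes):
        if not _ESCAPE.search(url):
            return url
        url = unquote(url)
    if _ESCAPE.search(url):
        raise ValueError("INVALID_ENCODING: still encoded after double decode")
    return url

def encode_for_request(url: str) -> str:
    """Encode a stored (decoded) URL exactly once, just before sending it."""
    return quote(url, safe="")

once = encode_for_request("http://example.com/a b")
twice = encode_for_request(once)   # the accidental double encoding
print(decode_for_storage(twice))   # http://example.com/a b
```

Note that this simple check would also reject a URL that legitimately contains a literal `%XX` sequence after decoding, so treat it as a starting point rather than a complete validator.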
Note: Our service supports UTF-8 characters, but odd results may occur if your web page uses a non-UTF-8 encoding and your end users' browsers combine with it to produce invalid requests. Read more about that on our page on UTF-8 compliant requests. Invalid requests submitted under this scenario are blocked by our system, but if any were to make it through, they would result in a HASH_MISMATCH error.
Capturing non-English Web Page Screenshots
We support a wide variety of languages and fonts, but certainly not all of them. If you find that a web page does not render its text or characters properly, you may open a support ticket to request that we add support for your language. Please note that adding language support depends on operating system vendors and their communities.
Output as a PDF File
We also support a wide variety of languages and fonts for PDF file output. Most web pages should render non-English characters properly.