yuanpengfei 2007-8-17 09:02
Truncating strings in PHP, including strings that contain HTML tags, without cutting a tag in half or leaving tags unpaired

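I recently needed to extract a piece of text from HTML content to use as a summary. With an ordinary substring call you can end up cutting an HTML tag in half, which breaks the page layout. So I wrote the function below, which skips over tags while counting and closes any tag that is still open at the cut point. Note that the number of characters you ask for does not include HTML tags!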
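To make the problem concrete, here is a minimal sketch. The input string is made up for illustration and is not from the original post. It shows a plain `substr` cut landing in the middle of a tag, and uses `strip_tags` to show the visible text, which is the length the `$num` argument is measured against.

```php
<?php
// Hypothetical sample input (not from the original post)
$html = '<p>Hello <a href="index.html">world</a></p>';

// Plain byte-based cut: tags count toward the limit and can be split in half
$naive = substr($html, 0, 15);
echo $naive, "\n";

// Visible text only: this is what a tag-aware truncation should count
$text = strip_tags($html);
echo $text, ' (', strlen($text), " visible characters)\n";
```

The first line of output ends in an unfinished `<a hre`, which is exactly the kind of fragment that breaks a page.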
CODE:
<?php
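/**
* Truncate an HTML string; HTML tags are not counted toward the length
* Author: 学无止境
* Email: xjtdy888@163.com
* QQ: 339534039
* Home: http://www.phpos.org
* Blog: http://hi.baidu.com/phps
* Please keep the author information when reposting
* @param string $str  the HTML to truncate
* @param int    $num  number of characters to keep (tags not counted)
* @param bool   $more whether to append "..." to the result
* @return string the truncated string
*/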
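function phpos_chsubstr_ahtml($str,$num,$more=false)
{
    $leng=strlen($str);
    // the byte length is an upper bound on the number of visible characters
    if($num>=$leng)    return $str;
    $word=0;                     /** visible characters counted so far **/
    $i=0;                        /** byte offset into the string **/
    $stag=array();               /** stack of tags opened but not yet closed **/
    $void=array('br','img','hr','input','meta','link');    /** tags that never take a closing tag **/
    while($word<$num && $i<$leng)
    {
        $c=ord($str[$i]);
        if($c>=0x80)
        {
            // multi-byte UTF-8 character: the lead byte gives its length
            if($c>=0xF0)         $i+=4;
            else if($c>=0xE0)    $i+=3;
            else                 $i+=2;
            $word++;
        }
        else if($str[$i]=='<')
        {
            if(substr($str,$i,2)=='<!')
            {
                // comment or declaration: skip it whole, it is neither text nor a tag to close
                $iscomment=(substr($str,$i,4)=='<!--');
                $end=$iscomment ? strpos($str,'-->',$i+4) : strpos($str,'>',$i+2);
                if($end===false)    break;        // unterminated: cut before it
                $i=$end+($iscomment ? 3 : 1);
                continue;
            }
            $closing=(substr($str,$i+1,1)=='/');
            $start=$i+($closing ? 2 : 1);
            // the tag name runs up to the first whitespace, "/" or ">"
            $name=strtolower(substr($str,$start,strcspn($str," \t\r\n/>",$start)));
            if(!preg_match('/^[a-z]/',$name))
            {
                // a "<" that does not start a tag counts as ordinary text
                $word++;
                $i++;
                continue;
            }
            $end=strpos($str,'>',$start);
            if($end===false)    break;            // unterminated tag: cut before it
            if($closing)
            {
                // drop the most recently opened tag with this name
                for($k=count($stag)-1;$k>=0;$k--)
                {
                    if($stag[$k]==$name)
                    {
                        array_splice($stag,$k,1);
                        break;
                    }
                }
            }
            else if(!in_array($name,$void) && $str[$end-1]!='/')
            {
                $stag[]=$name;
            }
            $i=$end+1;
        }
        else
        {
            $word++;
            $i++;
        }
    }
    $re=substr($str,0,$i);
    // close the tags that are still open, innermost first
    foreach(array_reverse($stag) as $name)    $re.='</'.$name.'>';
    if($more)    $re.='...';
    return $re;
}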

$str=<<<EOF
<h3>What is the <acronym>GNU</acronym> pr<a><a><a>oject?</h3>
<p>The <acronym>GNU</acronym> Project was launched in 1984 to develop a complete Unix-like operating system which is <a href="http://www.gnu.org/philosophy/free-sw.html">free software</a>: the <acronym>GNU</acronym> system. Variants of the <acronym>GNU</acronym> operating system, which use the kernel called Linux, are now widely used; though these systems are often referred to as &ldquo;Linux&rdquo;, they are more accurately called <a href="http://www.gnu.org/gnu/linux-and-gnu.html">GNU/Linux systems</a>. </p>
<p><acronym>GNU</acronym> is a recursive acronym for &ldquo;GNU's Not Unix&rdquo;; it is pronounced <em>guh-noo</em>, approximately like <em>canoe</em>.</p>
<h3>What is Free Software?</h3>
<p>&ldquo;<a href="http://www.gnu.org/philosophy/free-sw.html">Free software</a>&rdquo; is a matter of liberty, not price. To understand the concept, you should think of &ldquo;free&rdquo; as in &ldquo;free speech&rdquo;, not as in &ldquo;free beer&rdquo;.</p>
<p>Free software is a matter of the users' freedom to run, copy, distribute, study, change and improve the software. More precisely, it refers to four kinds of freedom, for the users of the software:</p>
<ul>
      <li>The freedom to run the program, for any purpose (freedom 0). </li>
      <li>The freedom to study how the program works, and adapt it to your needs (freedom 1). Access to the source code is a precondition for this. </li>
      <li>The freedom to redistribute copies so you can help your neighbor (freedom 2). </li>
      <li>The freedom to improve the program, and release your improvements to the public, so that the whole community benefits (freedom 3). Access to the source code is a precondition for this. </li>
</ul>
<h3>What is the Free Software Foundation?</h3>
<p>The <a href="http://www.fsf.org/">Free Software Foundation</a> (<abbr title="Free Software Foundation">) is the principal organizational sponsor of the Project. The receives very little funding from corporations or grant-making foundations, but relies on support from individuals like you. </abbr>) is the principal organizational sponsor of the Project. The receives very little funding from corporations or grant-making foundations, but relies on support from individuals like you. </p>
<p>Please consider helping the <abbr>by , or by . If you use Free Software in your business, you can also consider or as a way to support the . </abbr>by , or by . If you use Free Software in your business, you can also consider or as a way to support the . </p>
<p>The <acronym>GNU</acronym> project supports the mission of the <abbr>to preserve, protect and promote the freedom to use, study, copy, modify, and redistribute computer software, and to defend the rights of Free Software users. We support the on the Internet, , and the unimpeded by private monopolies. You can also learn more about these issues in the book . </abbr>to preserve, protect and promote the freedom to use, study, copy, modify, and redistribute computer software, and to defend the rights of Free Software users. We support the on the Internet, , and the unimpeded by private monopolies. You can also learn more about these issues in the book . </p>
<!--
Keep link lines at 72 characters or lynx will break them poorly.
Obviously, we list ONLY the most useful/important URLs here.
Keep it short and sweet: 3 lines and 2 columns is already enough
--><!-- BEGIN GNUmenu -->
EOF;
echo phpos_chsubstr_ahtml($str,800);
?>
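
If you want to check a truncated fragment, one option is to list the tags that are opened but never closed. The helper below is only a sketch: the name `unclosed_tags` and the list of void tags are my own assumptions, not part of the original function, and the regular expression does not handle a `>` inside an attribute value.

```php
<?php
// Hypothetical helper (not part of the original post): returns the names of
// tags that are opened but never closed in $html, in the order they were opened.
function unclosed_tags($html)
{
    $void = array('br', 'img', 'hr', 'input', 'meta', 'link');
    $open = array();
    // Match opening and closing tags. Comments do not match because a tag
    // name has to start with a letter.
    preg_match_all('#<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>#', $html, $m, PREG_SET_ORDER);
    foreach ($m as $tag) {
        $name = strtolower($tag[2]);
        if (in_array($name, $void) || $tag[3] === '/') {
            continue;                       // void or self-closed tag
        }
        if ($tag[1] === '/') {
            // closing tag: drop the most recent matching opener, if any
            $pos = array_search($name, array_reverse($open, true));
            if ($pos !== false) {
                array_splice($open, $pos, 1);
            }
        } else {
            $open[] = $name;
        }
    }
    return $open;
}

var_dump(unclosed_tags('<p>Hello <b>w'));          // cut without closing tags
var_dump(unclosed_tags('<p>Hello <b>w</b></p>'));  // properly closed fragment
```

An empty array for the output of the truncation function means every opened tag was closed again.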
