分类目录归档:语言 & 开发

语言 & 开发

PowerShell提速和多线程

本文目录

PowerShell性能优化系列文章

  1. PowerShell优化和性能测试
  2. 让你的PowerShell For循环提速四倍
  3. 优化PowerShell的性能和内存消耗
  4. PowerShell提速和多线程
  5. 优化PowerShell脚本的几个小技巧
  6. 轻量级的PowerShell性能测试

本篇文章本是源自PowerShell.Com上的一个教程视频,讲师为Dr. Tobias Weltner。有时间的朋友可以直接去看英文视频。我英文水平有限,还没到达单靠纯听力就能给视频加中文字幕的能力,所以就把原视频中的观点与例子分享出来。

概述

我们平时写脚本时,经常会提醒自己要多使用管道,要多使用流模式,少占内存,少占CPU。但是这篇文章会反其道而行之,少用管道,通过内存和CPU的占用来提高效率,也就是我们通常算法上说的用空间来换取时间。机器配置高,有的用,而不用就是浪费。

比如下面的场景:

  • 写一个大文件可能需要3.6分钟,提高性能后,只需3秒钟
  • 读一个大文件可能需要77秒钟,提高性能后,只需2秒钟
  • 检查250台机器的是否在线,需要23.2分钟,提高性能后,只需26秒钟
PowerShell性能提高前后对比图

PowerShell性能提高前后对比图

这一切性能的提升都是有偿的,需要你额外的投资。

投资更多内存

在PowerShell中推崇的管道主要是为了限制内存的使用量,让管道中的元素像流水线中的零件或者半成品一样,从车间一个一个穿过。但是管道并不是最快的,且看下面的随机数的例子。

管道流模式可以节省内存

管道流模式可以节省内存

随机数的例子

12345678910#这个很快PS> 1..100 |Get-Random90 #这个稍有延迟,可以容忍PS> 1..10000 |Get-Random4868 #这个慢的受不了,所以直接Ctrl-C取消。PS> 1..1000000 |Get-Random

是不是就意味着1000000这么大的一个数组Get-Random天生就这么慢,非也。换个方法:

123#这样做,快的一塌糊涂啊PS> Get-Random -InputObject (1..1000000)317486

原因是前者使用了管道,产生一条数据,流过一条数据。而后者是直接一次性产生全部数据,然后交给Get-Random,所以快。

写文件的例子

123$all = @('Some Test' * 20)*1000000$file "$Env:TEMP\testfile.txt"$all Out-File $file

上面的脚本执行大概需要215 秒钟(3.6分钟)
换个方式,使用inputObject,仅用了101秒钟(1.8分钟)

1Out-File $file -InputObject $all

是不是优化就止于此了呢,不,下面的结果会亮瞎我的眼睛啊,只需要2.5秒钟

1[Io.file]::WriteAllLines( $file$all,[text.encoding]::Unicode)

真真没想到直接调用.NET方法差别会这么大,但是我用的是PowerShell,你给我整.NET方法,用键盘敲起来未免坑爹啊,那就试试Set-Content吧,只需要3.1秒钟

1Set-Content $file -Value $all -Encoding Unicode

与WriteAllLine相比,稍慢,也可以忍受。

通过这个写文件的例子对比不难发现,3.6分钟/3.1秒钟=70,,速度几乎提高了70倍啊。

总结一下,写文件时注意两点即可:

  1. 不要使用管道。
  2. 不要使用神马的Out-这样的命令(因为它会做格式化)

读文件的例子

对于我们刚才创建的文本文件,一般读取时,我们习惯使用Get-Content,需要77秒钟(1.3分钟)

1Get-Content $file

而如果使用.NET方法,只需1.8秒钟

1[io.file]::ReadAllLines($file)

Get-Content为什么会这么慢,因为它要一行一行来读,并把读取的数据存储成数组,所以慢。但是Get-Content有一个参数ReadCount,把值设为0,一次性全部读取,只需2.2秒钟

1Get-Content $file -ReadCount 0

把文件读出来一般是需要处理的,如果这样肯定不是你预期的,因为只会输出一个X:

1Get-Content $file -ReadCount 0 | foreach {"X" }

这样就对了,只需要2.7秒钟

12$text Get-Content $file -ReadCount 0foreach ($line in $text) {'X'}

千万不要多次一举,引入管道,得68秒钟啊:

12$text Get-Content $file -ReadCount 0 | foreach {"X" }$text ForEach-Object {'X'}

警告

  • cmdlet明显非常慢
  • .NET一些底层的方法相对较快

解药

  • 不要轻易引入管道。
  • 尽量使用传统的For或者foreach循环

如果你还不相信,请继续看例子。

多用循环,少用管道的例子

11..1000000  | ForEach-Object "looping for the $_ Time"}

上面使用管道,执行时间为6.9秒钟。如果换成简单的For循环,只需要0.5秒钟,速度提高了14倍。

1234For $x=1; $x -le 100000;$x++){"Looping for the $x. time"}

再看一个抑制输出的例子,三个写法效果一样,速度相差几十倍。

12345678#耗时0.1毫秒'Hello' |out-null #耗时 0.002毫秒,速度提高了56倍$null'Hello' #耗时0.0025毫秒,速度也很快[void] 'Hello'

投资更多CPU

PowerShell默认是单线程的执行的,只能一行命令接着一行命令来执行。我们可以使用PowerShell的后台Job来提高效率:对于批量任务,启用多个后台任务去处理,使用wait-job等待所有的任务结束。

多个后台任务批处理

多个后台任务批处理

先看一个顺序执行的例子。

12345678910111213$start Get-Date$code1 = { Start-Sleep -Seconds 5; 'A' }$code2 = { Start-Sleep -Seconds 6; 'B'}$code3 = { Start-Sleep -Seconds 7; 'C'} $result1,$result2,$result3= (& $code1),(& $code2),(& $code3) $end =Get-Date$timespan$end $start$seconds $timespan.TotalSeconds Write-Host "总耗时 $seconds 秒."Write-Host "三个脚本块总共延时 18 秒"

输出为(耗时18秒钟):

总耗时 18.0240865 秒.
三个脚本块总共延时 18 秒

同样的任务,使用后台Job多线程执行:

123456789101112131415161718$start Get-Date$code1 = { Start-Sleep -Seconds 5; 'A' }$code2 = { Start-Sleep -Seconds 6; 'B'}$code3 = { Start-Sleep -Seconds 7; 'C'} $job1 Start-Job -ScriptBlock $code1$job2 Start-Job -ScriptBlock $code2$job3 Start-Job -ScriptBlock $code3 $alljobs =  Wait-Job $job1,$job2,$job3$result1,$result2,$result3 Receive-Job $alljobs $end =Get-Date $timespan$end $start$seconds $timespan.TotalSecondsWrite-Host "总耗时 $seconds 秒."Write-Host "三个脚本块总共延时 18 秒"

输出为(耗时10秒钟):

总耗时 10.3778469 秒.
三个脚本块总共延时 18 秒

效率提升很明显。

使用后台Job的开销

  1. 每一个新的任务执行时都会使用一个新PowerShell进程。(所以所谓的多线程并不是真正的多线程,而是工作在进程级别上)
  2. 每一个任务的结果需要序列化后,跨进程传递给调度的主进程。
  3. 没有节流机制(所以要注意控制后台Job的数量)。
计算后台Job的开销

使用这段示例脚本:

1234567891011121314151617181920212223242526272829303132333435363738# (C) 2012 Dr. Tobias Weltner# you may freely use this code for commercial or non-commercial purposes at your own risk# as long as you credit its original author and keep this comment block.# For PowerShell training or PowerShell support, feel free to contact tobias.weltner@email.de $code = {$begin Get-Date$result Get-Process$end Get-Date $begin$end# play here by reducing the returned data,# i.e. use select-object to pick specific properties:$result} $start Get-Date $job Start-Job -ScriptBlock $code$null Wait-Job $job$completed Get-Date $result Receive-Job $job$received Get-Date $spinup $result[0]$exit $result[1] $timeToLaunch = ($spinup $start).TotalMilliseconds$timeToExit = ($completed $exit).TotalMilliseconds$timeToRunCommand = ($exit $spinup).TotalMilliseconds$timeToReceive = ($received $completed).TotalMilliseconds '{0,-30} : {1,10:#,##0.00} ms' -f 'Time to set up background job'$timeToLaunch'{0,-30} : {1,10:#,##0.00} ms' -f 'Time to run code'$timeToRunCommand'{0,-30} : {1,10:#,##0.00} ms' -f 'Time to exit background job'$timeToExit'{0,-30} : {1,10:#,##0.00} ms' -f 'Time to receive results'$timeToReceive

脚本会在后台的Job的启动,运行,结束和接受数据每个阶段设置时间戳,然后计算各个阶段耗费的时间。

第一次运行时,输出结果为:

Time to set up background job  :     270.01 ms
Time to run code               :      10.00 ms
Time to exit background job    :   1,550.06 ms
Time to receive results        :      10.00 ms

主要的延迟在Job退出时,因为上面的Job返回了大量的数据,假如我们注释掉上面示例代码的第15行,再执行一遍:

Time to set up background job  :     350.01 ms
Time to run code               :      10.00 ms
Time to exit background job    :      10.00 ms
Time to receive results        :       0.00 ms

执行效率明显提高,但是绝大多数的后台Job应当都会返回数据的,哪怕返回一点,所以我们把上面脚本的第15改成这样,再执行一遍:

1$result select-object Name,CPU
Time to set up background job  :     420.02 ms
Time to run code               :      10.00 ms
Time to exit background job    :     100.01 ms
Time to receive results        :       0.00 ms

稍有延迟,可以忍受。

结论

通过这个例子主要是告诉大家,影响后台任务的关键因素是返回的数据量,如果没有特别的需求,尽量在后台Job中不返回数据,或者少返回数据。

从进程间迁移到进程内

正如后台Job为用户所诟病的那样,它不是真正的多线程,而是多进程。所以下面我们开始从进程间迁移到进程内,使用实至名归的PowerShell多线程,因为它会在PowerShel.exe内部创建一个新的线程。

使用进程内多线程的优点

  • 不需要新的宿主进程
  • 不需要序列化结果
  • 线程内通信方便
  • 运行空间池提供了自动内存节流
开启一个线程

先看一个简单的在PowerShell中开启一个同步线程的例子

PS>  # Running New Thread Synchronously:
PS> $code = { Start-Sleep -Seconds 2; "Hello" }
PS> $newPowerShell = [PowerShell]::Create().AddScript($code)
PS> $newPowerShell.Invoke()
Hello
让线程异步运行

稍加改动,使用BeginInvoke()异步执行,使用EndInvoke()返回线程的数据:

123456789101112$code = {Start-Sleep -Seconds 2; "Hello"} $newPowerShell [PowerShell]::Create().AddScript($code)$handle $newPowerShell.BeginInvoke() while ($handle.IsCompleted -eq $false) {Write-Host '.' -NoNewlineStart-Sleep -Milliseconds 500} Write-Host ''$newPowerShell.EndInvoke($handle)

输出示例,先有原点的进度条

PS>
.....
Hello
演示一个进度提示器
123456789101112131415161718192021function Start-Progress {param([ScriptBlock]$code) $newPowerShell [PowerShell]::Create().AddScript($code)$handle $newPowerShell.BeginInvoke() while ($handle.IsCompleted -eq $false) {Write-Host '.' -NoNewlineStart-Sleep -Milliseconds 500} Write-Host '' $newPowerShell.EndInvoke($handle) $newPowerShell.Runspace.Close()$newPowerShell.Dispose()}

记得要在运行空间使用结束后,调用Close和Dispose方法释放资源。

先显示进度信息,然后返回结果。

PS> Start-Progress -code {Get-HotFix}
..

Source        Description      HotFixID      InstalledBy          InstalledOn
------        -----------      --------      -----------          -----------
ETS-V-TEST-01 Update           KB2899189_... NT AUTHORITY\SYSTEM  5/14/2014 12:00:00 AM
ETS-V-TEST-01 Update           KB2919355     ETS-V-TEST-01\Adm... 3/18/2014 12:00:00 AM
ETS-V-TEST-01 Update           KB2919442     ETS-V-TEST-01\Adm... 3/18/2014 12:00:00 AM
ETS-V-TEST-01 Security Update  KB2920189     NT AUTHORITY\SYSTEM  5/14/2014 12:00:00 AM
ETS-V-TEST-01 Security Update  KB2926765     NT AUTHORITY\SYSTEM  5/15/2014 12:00:00 AM
ETS-V-TEST-01 Security Update  KB2931366     NT AUTHORITY\SYSTEM  5/14/2014 12:00:00 AM

这个执行起来太快了,换一个慢一点的命令,效果更明显。

PS> Start-Progress -code {Get-WmiObject -Class Win32_product}
................................................
IdentifyingNumber : {90150000-0015-0409-0000-0000000FF1CE}
Name              : Microsoft Access MUI (English) 2013
Vendor            : Microsoft Corporation
Version           : 15.0.4569.1506
Caption           : Microsoft Access MUI (English) 2013

IdentifyingNumber : {90150000-0115-0409-0000-0000000FF1CE}
Name              : Microsoft Office Shared Setup Metadata MUI (English) 2013
Vendor            : Microsoft Corporation
Version           : 15.0.4569.1506
Caption           : Microsoft Office Shared Setup Metadata MUI (English) 2013
演示***
123456789101112131415161718192021222324252627function Start-Timebomb {param([Int32]$Seconds, [ScriptBlock]$Action = { Stop-Process -Id $PID }) $Wait "Start-Sleep -seconds $seconds"$script:newPowerShell [PowerShell]::Create().AddScript($Wait).AddScript($Action)$handle $newPowerShell.BeginInvoke()Write-Warning "Timebomb is active and will go off in $Seconds seconds unless you call Stop-Timebomb before."} function Stop-Timebomb {if $script:newPowerShell -ne $null) {Write-Host 'Trying to stop timebomb...' -NoNewline$script:newPowerShell.Stop()$script:newPowerShell.Runspace.Close()$script:newPowerShell.Dispose()Remove-Variable newPowerShell -Scope scriptWrite-Host 'Done!'else {Write-Warning 'No timebomb found.'}}

在控制台上开启了***后,如果没有及时停止,倒计时结束后,控制台会自动关闭。

PS> Start-Timebomb -Seconds 10
WARNING: Timebomb is active and will go off in 10 seconds unless you call Stop-Timebomb before.
监控脚本的执行时间

如果你想让倒计时的信息显示在控制台的标题栏,只需要修改上面的脚本第10行,修改成:

1$Wait "1..$seconds | foreach-object {start-sleep -seconds 1;  [console]::Title=""`$($Seconds-`$_) seconds remaining`"}"
监控脚本的运行内存

如果一个脚本运行时,占用的内存超过了限制,就自动终结掉这个进程。

123456789101112131415161718192021222324252627282930313233343536373839404142function Start-TimebombMemory {param([Int32]$MemoryMB=30, [ScriptBlock]$Action = { Stop-Process -Id $PID }) $Wait '$initial = (Get-Process -Id $PID).WorkingSet$threshold = (XXX * 1MB)do  {$memory = ((Get-Process -Id $PID).WorkingSet - $initial)Start-Sleep -Seconds 1[system.Console]::Title = ("Current Memory Load: {0:0.00} MB. Threshold: XXX MB" -f ($memory/1MB))} while ($memory -lt $threshold) $message1 = "Shell is using {0:0.0} MB which is exceeding the threshold by {1:0.0} MB." -f ($memory/1MB), (($memory-$threshold)/1MB)$message2 = "Shell will be aborted in 5 seconds. There is nothing you can do about it, sorry."[System.Console]::WriteLine($message1)[System.Console]::WriteLine($message2)Start-Sleep -Seconds 5' -replace 'XXX'$MemoryMB $script:newPowerShellMB [PowerShell]::Create().AddScript($Wait).AddScript($Action)$handle $newPowerShellMB.BeginInvoke()Write-Warning "Timebomb is active and will go off when the shell uses more than $memoryMB MB -  unless you call Stop-Timebomb before."} function Stop-TimebombMemory {if $script:newPowerShellMB -ne $null) {Write-Host 'Trying to stop timebomb...' -NoNewline$script:newPowerShellMB.Stop()$script:newPowerShellMB.Runspace.Close()$script:newPowerShellMB.Dispose()Remove-Variable newPowerShellMB -Scope scriptWrite-Host 'Done!'else {Write-Warning 'No timebomb found.'}}

执行了Start-TimebombMemory后,会在PowerShell的控制台动态显示当前PowerShell进程占用的内存和阈值,如果内存占用超标,打印信息提示用户,并在5秒钟后自动关闭当前进程。

动态监控脚本的运行内存

动态监控脚本的运行内存

创建一个STA模式的线程

你写了一个函数,调用winform的OpenFileDialog来打开文件选择对话框。很不幸如果当前的控制台运行在MTA模式下,则对话框不能显示。所以为了增强兼容性,给你的函数单独指定一个线程运行,因为在运行空间中可以指定Apartment State。具体看下面的代码:

1234567891011121314151617181920212223242526272829303132333435363738function Show-OpenFileDialog {param([string]$Title='Select a file',[string]$Path=$home,[string]$Filter "All Files (*.*)|*.*")  $code = {param([string]$Title,[string]$Path,[string]$Filter "All Files (*.*)|*.*") Add-Type -AssemblyName System.Windows.Forms $DialogOpen New-Object System.Windows.Forms.OpenFileDialog$DialogOpen.InitialDirectory = $Path$DialogOpen.Filter $Filter$DialogOpen.Title = $Title$Result $DialogOpen.ShowDialog()if ($Result -eq "OK"){$DialogOpen.FileName}} $newRunspace [RunSpaceFactory]::CreateRunspace()$newRunspace.ApartmentState = 'MTA'$newRunspace.Open()$newPowerShell [PowerShell]::Create()$newPowerShell.Runspace = $newRunspace[void]$newPowerShell.AddScript($code).AddArgument($Title).AddArgument($Path).AddArgument($Filter)$newPowerShell.Invoke()$newPowerShell.Runspace.Close()$newPowerShell.Dispose()}
多线程中的关键组件
  • PowerShell:代表线程
  • RunSpace:代表Powershell会话
  • BeginInvoke():返回等待的句柄
  • EndInvoke():返回结果对象
  • MTA和STA模式可以完全控制
  • 每次执行完毕后,记得释放RunSpace,销毁线程。
启用节流
  1. 创建一个RunSpace 池
  2. 使用RunSpace池代替Runspace
  3. 在池中控制活动的RunSpace个数
演示简单的运行空间池

限定活动的线程最多为5,这样当尝试开启40个线程时,并不是一下子开启,而是排队等候空闲的线程池,每次最多只能有5个活动的线程池。

12345678910111213141516$throttleLimit = 5$iss [system.management.automation.runspaces.initialsessionstate]::CreateDefault()$Pool [runspacefactory]::CreateRunspacePool(1, $throttleLimit$iss$Host)$Pool.Open() $ScriptBlock = {param($id)Start-Sleep -Seconds 2[System.Console]::WriteLine("Done processing ID $id")} for ($x = 1; $x -le 40; $x++) {$powershell ::Create().AddScript($ScriptBlock).AddArgument($x)$powershell.RunspacePool = $Pool$handle $powershell.BeginInvoke()}
从多线程中接受数据
1234567891011121314151617181920212223242526272829303132333435363738$throttleLimit = 4$SessionState [system.management.automation.runspaces.initialsessionstate]::CreateDefault()$Pool [runspacefactory]::CreateRunspacePool(1, $throttleLimit$SessionState$Host)$Pool.Open() $ScriptBlock = {param($id) Start-Sleep -Seconds 2"Done processing ID $id"} $threads = @() $handles for ($x = 1; $x -le 40; $x++) {$powershell ::Create().AddScript($ScriptBlock).AddArgument($x)$powershell.RunspacePool = $Pool$powershell.BeginInvoke()$threads += $powershell} do {$i = 0$done $trueforeach ($handle in $handles) {if ($handle -ne $null) {if ($handle.IsCompleted) {$threads[$i].EndInvoke($handle)$threads[$i].Dispose()$handles[$i] = $nullelse {$done $false}}$i++}if (-not $done) { Start-Sleep -Milliseconds 500 }until ($done)

声明:本文所有观点,图片,示例脚本引用自 Dr. Tobias Weltner的视频教程。感谢 Dr. Tobias Weltner!
示例脚本出处demofiles_multithreading
视频出处Speeding Up PowerShell (网盘:密码: 7w21 )

WebConfig特殊字符的转义

Web.Config默认编码格式为UTF-8,对于XML文件,要用到实体转义码来替换。对应关系如下:

字符

转义码

& 符号 & &
单引号 '
双引号 "
大于 > >
小于 < &lt;
 

App.config:

 

 <?xml version="1.0" encoding="utf-8" ?>
 <configuration>
     <startup> 
         <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5" />
     </startup>
   <connectionStrings>
     <add name="DBconnString" connectionString="Data Source=.;Initial Catalog=MyTest123456;User ID=sa;PassWord=123&456"/>
   </connectionStrings>
 </configuration>

 

由于数据库连接的密码中含有特殊字符”&”,编译时出现如下如下错误信息:

显然编译器不认识”&456″,怎么解决呢,总不能更换密码吧?

事实上App.config是xml文件,在xml文件中特殊字符要进行HTML转义。

HTML中<,>,&等有特殊含义(<,>,用于链接签,&用于转义),不能直接使用。这些符号是不显示在我们最终看到的网页里的,那如果我们希望在网页中显示这些符号,就要用到HTML转义字符串(Escape Sequence)了

 


HTML特殊转义字符列表

 

最常用的字符实体

显示     说明     实体名称     实体编号

 

         空格     &nbsp;     &#160;

<         小于     &lt;      &#60;

>      大于    &gt;      &#62;

&      &符号    &amp;     &#38;

“      双引号    &quot;     &#34;

©      版权    &copy;    &#169;

®    已注册商标    &reg;    &#174;

™    商标(美国)    ™    &#8482;

×    乘号      &times;    &#215;

÷     除号      &divide;    &#247;

 

所以只要把”&”进行转义就可以了,将PassWord改为PassWord=123&amp;456″成功通过编译。

 

Python日志监控系统处理日志(pyinotify)

前言

最近项目中遇到一个用于监控日志文件的Python包pyinotify,结合自己的项目经验和网上的一些资料总结一下,总的原理是利用pyinotify模块监控日志文件夹,当日志到来的情况下,触发相应的函数进行处理,处理完毕后删除日志文件的过程,下面就着重介绍下pyinotify

pyinotify

Pyinotify是一个Python模块,用来监测文件系统的变化。 Pyinotify依赖于Linux内核的功能—inotify(内核2.6.13合并)。 inotify的是一个事件驱动的通知器,其通知接口通过三个系统调用从内核空间到用户空间。pyinotify结合这些系统调用,并提供一个顶级的抽象和一个通用的方式来处理这些功能。

  • pyinotify 说百了就是通过 调用系统的inotify来实现通知的
  • inotify 既可以监视文件,也可以监视目录
  • Inotify 使用系统调用而非 SIGIO 来通知文件系统事件。

Inotify 可以监视的文件系统事件包括:

Event NameIs an EventDescription
IN_ACCESSYesfile was accessed.
IN_ATTRIBYesmetadata changed.
IN_CLOSE_NOWRITEYesunwrittable file was closed.
IN_CLOSE_WRITEYeswrittable file was closed.
IN_CREATEYesfile/dir was created in watched directory.
IN_DELETEYesfile/dir was deleted in watched directory.
IN_DELETE_SELFYes自删除,即一个可执行文件在执行时删除自己
IN_DONT_FOLLOWNodon’t follow a symlink (lk 2.6.15).
IN_IGNOREDYesraised on watched item removing. Probably useless for you, prefer instead IN_DELETE*.
IN_ISDIRNoevent occurred against directory. It is always piggybacked to an event. The Event structure automatically provide this information (via .is_dir)
IN_MASK_ADDNoto update a mask without overwriting the previous value (lk 2.6.14). Useful when updating a watch.
IN_MODIFYYesfile was modified.
IN_MOVE_SELFYes自移动,即一个可执行文件在执行时移动自己
IN_MOVED_FROMYesfile/dir in a watched dir was moved from X. Can trace the full move of an item when IN_MOVED_TO is available too, in this case if the moved item is itself watched, its path will be updated (see IN_MOVE_SELF).
IN_MOVED_TOYesfile/dir was moved to Y in a watched dir (see IN_MOVE_FROM).
IN_ONLYDIRNoonly watch the path if it is a directory (lk 2.6.15). Usable when calling .add_watch.
IN_OPENYesfile was opened.
IN_Q_OVERFLOWYesevent queued overflowed. This event doesn’t belongs to any particular watch.
IN_UNMOUNTYes宿主文件系统被 umount
IN_ACCESS,即文件被访问
IN_MODIFY,文件被write
IN_ATTRIB,文件属性被修改,如chmod、chown、touch等
IN_CLOSE_WRITE,可写文件被close
IN_CLOSE_NOWRITE,不可写文件被close
IN_OPEN,文件被open
IN_MOVED_FROM,文件被移走,如mv
IN_MOVED_TO,文件被移来,如mv、cp
IN_CREATE,创建新文件
IN_DELETE,文件被删除,如rm
IN_DELETE_SELF,自删除,即一个可执行文件在执行时删除自己
IN_MOVE_SELF,自移动,即一个可执行文件在执行时移动自己
IN_UNMOUNT,宿主文件系统被umount
IN_CLOSE,文件被关闭,等同于(IN_CLOSE_WRITE | IN_CLOSE_NOWRITE)
IN_MOVE,文件被移动,等同于(IN_MOVED_FROM | IN_MOVED_TO)

pyinotify使用例子

#!/usr/bin/python
# coding:utf-8
 
import os
from pyinotify import WatchManager, Notifier,ProcessEvent,IN_DELETE, IN_CREATE,IN_MODIFY
   
class EventHandler(ProcessEvent):
 """事件处理"""
 def process_IN_CREATE(self, event):
  print  "Create file: %s " %  os.path.join(event.path,event.name)
   
 def process_IN_DELETE(self, event):
  print  "Delete file: %s " %  os.path.join(event.path,event.name)
  
 def process_IN_MODIFY(self, event):
   print  "Modify file: %s " %  os.path.join(event.path,event.name)
   
def FSMonitor(path='.'):
  wm = WatchManager() 
  mask = IN_DELETE | IN_CREATE |IN_MODIFY
  notifier = Notifier(wm, EventHandler())
  wm.add_watch(path, mask,auto_add=True,rec=True)
  print 'now starting monitor %s'%(path)
  while True:
   try:
     notifier.process_events()
     if notifier.check_events():
       notifier.read_events()
   except KeyboardInterrupt:
     notifier.stop()
     break
   
if __name__ == "__main__":
 FSMonitor('/root/softpython/apk_url')

用pip命令安装第三方包出现retrying且ssl error问题汇总

今天pip包时一直retrying且报ssl error的错误,我弄了一上午才好,网上有很多解决方案,但是没有pip安装失败的汇总情况,如有同错,请对比以下情况,希望能解决你的问题,也烦请对不足之处指出。

一、

错误:CouldnotfetchURLhttps://pypi.python.org/simple/pytest-xdist/: There wasaproblem confirmingthessl certificate: [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocolversion(_ssl.c:590)

原因:python.org已经不支持TLSv1.0和TLSv1.1,需要升级pip,但是pip用不了,所以手动升级

解决方案:

1.mac或者linux操作系统:在终端下执行命令:

curl https://bootstrap.pypa.io/get-pip.py | python3。

2.windows操作系统:从https://bootstrap.pypa.io/get-pip.py下载get-pip.py文件,然后使用python运行这个文件python get-pip.py

二、

错误:Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None))

after connection broken by ‘SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAI

LED] certificate verify failed (_ssl.c:661)’),)’

原因:可能用国外镜像源连接不好

解决方案:

1.mac或者linux操作系统:

修改 ~/.pip/pip.conf (没有就创建一个), 更换 index-url

[global]

index-url = http://pypi.douban.com/simple

2.windows操作系统:

直接在user目录中创建一个pip目录,如:C:\Users\xx\pip,新建文件pip.ini,内容同上

附:镜像源

阿里云 http://mirrors.aliyun.com/pypi/simple/

中国科技大学 https://pypi.mirrors.ustc.edu.cn/simple/

豆瓣 http://pypi.douban.com/simple

Python官方 https://pypi.python.org/simple/

v2ex http://pypi.v2ex.com/simple/

中国科学院 http://pypi.mirrors.opencas.cn/simple/

清华大学 https://pypi.tuna.tsinghua.edu.cn/simple/

三、

错误:Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host=’pypi.org’, port=443): Max retries exceeded with url: /simple/pip/ (Caused by S

SLError(SSLError(1, ‘[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833)’),)) – skipping

原因:在最新的 pip 版本(>=7)中,使用镜像源时,会提示源地址不受信任或不安全

解决方案1:

pip install –trusted-host pypi.org –trusted-host files.pythonhosted.org 包名

解决方案2:(推荐)

在第二个解决方案中:添加一项配置

[install]
trusted-host=http://pypi.douban.com/simple

理解 Python 装饰器

讲 Python 装饰器前,我想先举个例子,虽有点污,但跟装饰器这个话题很贴切。

每个人都有的内裤主要功能是用来遮羞,但是到了冬天它没法为我们防风御寒,咋办?我们想到的一个办法就是把内裤改造一下,让它变得更厚更长,这样一来,它不仅有遮羞功能,还能提供保暖,不过有个问题,这个内裤被我们改造成了长裤后,虽然还有遮羞功能,但本质上它不再是一条真正的内裤了。于是聪明的人们发明长裤,在不影响内裤的前提下,直接把长裤套在了内裤外面,这样内裤还是内裤,有了长裤后宝宝再也不冷了。装饰器就像我们这里说的长裤,在不影响内裤作用的前提下,给我们的身子提供了保暖的功效。 继续阅读

Makefiles Tutorial

GNU Make is key utility in building applications.It is available on any operating systems. Its main purpose is to determine automatically which pieces of a program need to be recompiled, and issue the commands to build them

While working with open source project, Make is the common tool because almost any IDE can work with it. Make is a very powerful tool – in one command you can issue a series of tasks and working with it can make developers life easier. 继续阅读

Python — imaplib IMAP example with Gmail

I couldn’t find all that much information about IMAP on the web, other than the RFC3501.

The IMAP protocol document is absoutely key to understanding the commands available, but let me skip attempting to explain and just lead by example where I can point out the common gotchas I ran into.

Logging in to the inbox

1
2
3
4
5
6
import imaplib
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('myusername@gmail.com', 'mypassword')
mail.list()
# Out: list of "folders" aka labels in gmail.
mail.select("inbox") # connect to inbox.

Getting all mail and fetching the latest

Let’s start by searching our inbox for all mail with the search function.
Use the built in keyword “ALL” to get all results (documented in RFC3501).

We’re going to extract the data we need from the response, then fetch the mail via the ID we just received.

1
2
3
4
5
6
7
8
9
10
result, data = mail.search(None, "ALL")
ids = data[0] # data is a list.
id_list = ids.split() # ids is a space separated string
latest_email_id = id_list[-1] # get the latest
result, data = mail.fetch(latest_email_id, "(RFC822)") # fetch the email body (RFC822) for the given ID
raw_email = data[0][1] # here's the body, which is raw text of the whole email
# including headers and alternate payloads

Using UIDs instead of volatile sequential ids

继续阅读

How to:Python

1、django.core.exceptions.ImproperlyConfigured: Requested setting DEBUG, but settings are not configured.

You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.

>>> from django.conf import settings
>>> settings.configure()
>>> from django.test.utils import setup_test_environment
>>> setup_test_environment()

use MySQL on Django with Python3.6

My environment: Windows 10, Python 3.6, Django 1.11.4., PyMySQL 0.6.1, MySQL Server 5.7 on Windows

How to make it work:

  1. Install PyMySQL version 0.7.11 (https://github.com/PyMySQL/PyMySQL/): you can install it either by using pip, i.e. : pip install PyMySQL or by manually downloading the package; there is a good documentation on their website on how to do that.
  2. Open your Django App __init__.py and paste the following lines:
    import pymysql
    pymysql.install_as_MySQLdb() 
    
  3. Now, open settings.py and make sure your DATABASE property looks like this:
    DATABASES = {
       'default': {
           'ENGINE': 'django.db.backends.mysql',
           'NAME': 'mydb',
           'USER': 'dbuser',
           'PASSWORD': 'dbpassword',
           'HOST': 'dbhost',
           'PORT': '3306'
        }
    }
    
  4. That’s it, you should be able to execute python manage.py syncdb to init your MySQL DB; see the sample output below:
    Creating tables ...
    Creating table django_admin_log
    Creating table auth_permission
    Creating table auth_group_permissions
    ...
    ...
    Creating table socialaccount_socialtoken
    
    You just installed Django's auth system, which means you don't have any superusers defined.
    ...
    
  5. you should be able to execute python manage.py migrate to creates any necessary database tables according
    to the database settings in your mysite/settings.py file and the database migrations shipped with the app; see the sample output below:

    
    Operations to perform:
      Apply all migrations: admin, auth, contenttypes, sessions
    Running migrations:
      Applying contenttypes.0001_initial... OK
      Applying auth.0001_initial... OK
      Applying admin.0001_initial... OK
      Applying admin.0002_logentry_remove_auto_add... OK
      Applying contenttypes.0002_remove_content_type_name... OK
      Applying auth.0002_alter_permission_name_max_length... OK
      Applying auth.0003_alter_user_email_max_length... OK
      Applying auth.0004_alter_user_username_opts... OK
      Applying auth.0005_alter_user_last_login_null... OK
      Applying auth.0006_require_contenttypes_0002... OK
      Applying auth.0007_alter_validators_add_error_messages... OK
      Applying auth.0008_alter_user_username_max_length... OK
      Applying sessions.0001_initial... OK
    

File System Redirector

The %windir%\System32 directory is reserved for 64-bit applications. Most DLL file names were not changed when 64-bit versions of the DLLs were created, so 32-bit versions of the DLLs are stored in a different directory. WOW64 hides this difference by using a file system redirector.

In most cases, whenever a 32-bit application attempts to access %windir%\System32, the access is redirected to %windir%\SysWOW64. Access to %windir%\lastgood\system32 is redirected to %windir%\lastgood\SysWOW64. Access to %windir%\regedit.exe is redirected to %windir%\SysWOW64\regedit.exe.

If the access causes the system to display the UAC prompt, redirection does not occur. Instead, the 64-bit version of the requested file is launched. To prevent this problem, either specify the SysWOW64 directory to avoid redirection and ensure access to the 32-bit version of the file, or run the 32-bit application with administrator privileges so the UAC prompt is not displayed. 继续阅读