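Remove the top line of a text file with PowerShell

I am trying to remove just the first line of about 5000 text files before importing them.

I am still very new to PowerShell, so I am not sure what to search for or how to approach this. My current concept, in pseudocode: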

set-content file (get-content unless line contains amount)

However, I can't seem to figure out how to do something like "contains".
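The "contains" test in the pseudocode above can be approximated with Where-Object and the -notmatch operator. What follows is only a sketch under assumptions: the function name Remove-LinesContaining and its parameters are made up for illustration, the match is case-insensitive, and it drops every line that contains the text rather than just the first line of the file.

```powershell
# Hypothetical helper (not from the original post): rewrite a file without
# the lines that contain a given piece of text.
function Remove-LinesContaining {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][string]$Text
    )

    # Escape the text so characters such as '.' or '$' are matched literally.
    $pattern = [regex]::Escape($Text)

    # The parentheses make Get-Content finish reading the file before
    # Set-Content opens the same path for writing.
    (Get-Content -LiteralPath $Path) |
        Where-Object { $_ -notmatch $pattern } |
        Set-Content -LiteralPath $Path
}
```

A call might look like: Remove-LinesContaining -Path $file -Text 'amount'. If the header line is always the first line, skipping by position is simpler and does not depend on the header's wording.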

It is not the most efficient approach in the world, but this should work:

 get-content $file | select -Skip 1 | set-content "$file-temp"
 move "$file-temp" $file -Force

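While I really admire @hoge's answer, both for a very concise technique and for a wrapper function that generalizes it, and I encourage upvotes for it, I am compelled to comment on the other two answers that use temp files (it gnaws at me like fingernails on a chalkboard!).

Assuming the file is not huge, you can force the pipeline to operate in discrete sections, and so avoid the need for a temp file, with judicious use of parentheses: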

 (Get-Content $file | Select-Object -Skip 1) | Set-Content $file

...or in short form:

 (gc $file | select -Skip 1) | sc $file
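To apply the same parenthesized pipeline to a whole folder, as with the roughly 5000 files in the question, it can be wrapped in a loop. This is a minimal sketch rather than a tested bulk tool: the function name Remove-FirstLineInFolder and its -Folder and -Filter parameters are hypothetical, and it assumes each file fits comfortably in memory and has more than one line.

```powershell
# Hypothetical helper (not from the original answers): strip the first line
# from every matching file in one folder, without temp files.
function Remove-FirstLineInFolder {
    param(
        [Parameter(Mandatory = $true)][string]$Folder,
        [string]$Filter = '*.txt'
    )

    Get-ChildItem -LiteralPath $Folder -Filter $Filter |
        Where-Object { -not $_.PSIsContainer } |   # files only, no directories
        ForEach-Object {
            $path = $_.FullName
            # The parentheses force the file to be read completely before
            # Set-Content opens the same path for writing.
            (Get-Content -LiteralPath $path | Select-Object -Skip 1) |
                Set-Content -LiteralPath $path
        }
}
```

A call might look like: Remove-FirstLineInFolder -Folder 'C:\data' (the folder path is a placeholder). Trying it on a copy of the files first is prudent, since the files are rewritten in place.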

Using variable notation, you can do it without a temporary file:

 ${C:\file.txt} = ${C:\file.txt} | select -skip 1

 function Remove-Topline ( [string[]]$path, [int]$skip=1 ) {
   if ( -not (Test-Path $path -PathType Leaf) ) {
     throw "invalid filename"
   }
   ls $path |
     % { iex "`${$($_.fullname)} = `${$($_.fullname)} | select -skip $skip" }
 }

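My solution was to lean more on .NET: StreamReader + StreamWriter. See this answer for a good discussion of performance: In PowerShell, what's the most efficient way to split a large text file by record type?

Here is my solution. Yes, it uses a temporary file, but in my case that did not matter (it was a huge file of SQL table creation and insert statements):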

 PS> (measure-command{
     $i = 0
     $ins = New-Object System.IO.StreamReader "in/file/pa.th"
     $outs = New-Object System.IO.StreamWriter "out/file/pa.th"
     while( !$ins.EndOfStream ) {
         $line = $ins.ReadLine();
         if( $i -ne 0 ) {
             $outs.WriteLine($line);
         }
         $i = $i+1;
     }
     $outs.Close();
     $ins.Close();
 }).TotalSeconds

It returned:

 188.1224443

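Inspired by AASoft's answer, I set out to improve it a bit more:

Avoid the loop variable $i and the comparison with 0 in every iteration
Wrap the execution in a try..finally block so that the files in use are always closed
Make the solution work for an arbitrary number of lines to remove from the beginning of the file
Use a variable $p to reference the current directory

These changes lead to the following code: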

 $p = (Get-Location).Path

 (Measure-Command {
     # Number of lines to skip
     $skip = 1
     $ins = New-Object System.IO.StreamReader ($p + "\test.log")
     $outs = New-Object System.IO.StreamWriter ($p + "\test-1.log")
     try {
         # Skip the first N lines, but allow for fewer than N, as well
         for( $s = 1; $s -le $skip -and !$ins.EndOfStream; $s++ ) {
             $ins.ReadLine()
         }
         while( !$ins.EndOfStream ) {
             $outs.WriteLine( $ins.ReadLine() )
         }
     }
     finally {
         $outs.Close()
         $ins.Close()
     }
 }).TotalSeconds

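The first change brought the processing time for my 60 MB file down from 5.3s to 4s. The rest of the changes are more cosmetic.

I just learned this from a website: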

 Get-ChildItem *.txt | ForEach-Object { (get-Content $_) | Where-Object {(1) -notcontains $_.ReadCount } | Set-Content -path $_ }

Or you can use aliases to make it shorter, like:

 gci *.txt | % { (gc $_) | ? { (1) -notcontains $_.ReadCount } | sc -path $_ }

-Skip did not work, so my workaround is:

 $LinesCount = $(get-content $file).Count
 get-content $file | select -Last $($LinesCount-1) | set-content "$file-temp"
 move "$file-temp" $file -Force

 $x = get-content $file
 $x[1..$x.count] | set-content $file

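That is all. A long explanation follows. Get-Content returns an array. We can "index into" array variables, as shown in this article and other Scripting Guys posts.

For example, if we define an array variable like this,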

 $array = @("first item","second item","third item")

so that $array returns

 first item
 second item
 third item

then we can "index into" that array to retrieve only its first element

 $array[0]

or only its second

 $array[1]

or a range of index values from the second through the last.

 $array[1..$array.count]

For smaller files, you can use this:

 & C:\windows\system32\more +1 oldfile.csv > newfile.csv | out-null

...but it is not very effective on my 16 MB example file. It does not seem to terminate and release the lock on newfile.csv.
