El mejor enfoque para leer archivos en VBA
Hace años, desarrollar una función simple para leer un archivo de ‘texto’ usando VBA para poder pasar el contenido del archivo ‘texto’ a una variable VBA para poder trabajar. Puede consultar ese artículo refiriéndose a:
Continuar leyendo
Más recientemente, haber usado a lo largo de los años una variedad de formas diferentes de leer archivos, tenía curiosidad si un enfoque era mejor (desde el punto de vista de rendimiento).
Los enfoques
Entonces, primero permítanme presentar brevemente los 3 enfoques principales que decidí usar y probar. Luego, tenga en cuenta que la mayoría también puede ofrecer una vinculación temprana de VS tardía, así como para emplear la codificación de la variable de objeto de autouración (SHOV) para una mayor optimización del rendimiento.
Flujo de ADODB
Una cosa a tener en cuenta con el enfoque de flujo ADODB es que puede manejar diferentes codificaciones de texto, por lo que esto también puede ser un factor en el enfoque que privilegue más allá de solo un rendimiento.
Codificación estándar
'--------------------------------------------------------------------------------------- ' Procedure : ADODB_ReadFileAsText ' Author : Daniel Pineault, CARDA Consultants Inc. ' Website : ' Purpose : This code opens a UTF-8 encoded text file and reads its entire contents ' Copyright : The following is release as Attribution-ShareAlike 4.0 International ' (CC BY-SA 4.0) - ' Req'd Refs: Late Binding -> None required ' Early Binding -> Microsoft ActiveX Data Objects x.x Library ' ' Input Variables: ' ~~~~~~~~~~~~~~~~ ' sFile : Fully qualified path and filename to read ' ' Usage: ' ~~~~~~ ' ADODB_ReadFileAsText("C:\Users\Dev\Desktop\Booking.json") ' Returns -> the content of the file in question ' ' Revision History: ' Rev Date(yyyy-mm-dd) Description ' ************************************************************************************** ' 1 2012-05-14 ' 2 2025-04-21 Initial Public Release '--------------------------------------------------------------------------------------- Function ADODB_ReadFileAsText(ByVal sFile As String) As String On Error GoTo Error_Handler #Const ADODBStream_EarlyBind = False 'Should normally be in the Module header #If ADODBStream_EarlyBind = True Then Dim oADODBStream As ADODB.stream Set oADODBStream = New ADODB.stream #Else Static oADODBStream As Object 'Const adTypeBinary = 1 Const adTypeText = 2 Set oADODBStream = CreateObject("ADODB.Stream") #End If With oADODBStream .Type = adTypeText .Charset = "utf-8" .Open .LoadFromFile sFile ADODB_ReadFileAsText = .ReadText .Close End With Error_Handler_Exit: On Error Resume Next Set oADODBStream = Nothing Exit Function Error_Handler: MsgBox "The following error has occurred" & vbCrLf & vbCrLf & _ "Error Source: ADODB_ReadFileAsText" & vbCrLf & _ "Error Number: " & Err.Number & vbCrLf & _ "Error Description: " & Err.Description & _ Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _ , vbOKOnly + vbCritical, "An Error has Occurred!" Resume Error_Handler_Exit End Function
Codificación de variable de objeto de autocuración (shov)
' Req'd Refs: Late Binding -> None required ' Early Binding -> Microsoft ActiveX Data Objects x.x Library #Const ADODBStream_EarlyBind = False #If ADODBStream_EarlyBind = True Then Private pADODBStream As ADODB.stream #Else Private pADODBStream As Object #End If 'Self-healing oADODBStream property #If ADODBStream_EarlyBind = True Then Public Function oADODBStream() As ADODB.stream #Else Public Function oADODBStream() As Object #End If On Error GoTo Err_Handler If pADODBStream Is Nothing Then Debug.Print "************************* Setting oADODBStream ***" Beep #If ADODBStream_EarlyBind = True Then Set pADODBStream = New ADODB.stream #Else Set pADODBStream = CreateObject("ADODB.Stream") #End If End If Set oADODBStream = pADODBStream Exit_Procedure: Exit Function Err_Handler: MsgBox "The following error has occured" & vbCrLf & vbCrLf & _ "Error Number: " & Err.Number & vbCrLf & _ "Error Source: Property Get oADODBStream" & vbCrLf & _ "Error Description: " & Err.Description & _ Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _ , vbOKOnly + vbCritical, "An Error has Occured!" Resume Exit_Procedure End Function Public Sub oADODBStream_Clear() 'Be sure to always run this when closing your Form/DB to avoid ' hidden instances from running in the background! Set pADODBStream = Nothing End Sub '--------------------------------------------------------------------------------------- ' Procedure : ADODB_ReadFileAsText_SHOV ' Author : Daniel Pineault, CARDA Consultants Inc. ' Website : ' Purpose : This code opens a UTF-8 encoded text file and reads its entire contents ' using SHOV. ' Copyright : The following is release as Attribution-ShareAlike 4.0 International ' (CC BY-SA 4.0) - ' Req'd Refs: Late Binding -> None required ' Early Binding -> Microsoft ActiveX Data Objects x.x Library ' ' Input Variables: ' ~~~~~~~~~~~~~~~~ ' sFile : Fully qualified path and filename to read ' ' Usage: ' ~~~~~~ ' ADODB_ReadFileAsText_SHOV("C:\Users\Dev\Desktop\Booking.json") ' Returns -> the content of the file in question ' ' Revision History: ' Rev Date(yyyy-mm-dd) Description ' ************************************************************************************** ' 1 2012-05-14 ' 2 2025-04-21 Initial Public Release '--------------------------------------------------------------------------------------- Function ADODB_ReadFileAsText_SHOV(ByVal sFile As String) As String On Error GoTo Error_Handler #If ADODBStream_EarlyBind = False Then Const adTypeText = 2 #End If With oADODBStream .Type = adTypeText .Charset = "utf-8" .Open .LoadFromFile sFile ADODB_ReadFileAsText_SHOV = .ReadText .Close End With Error_Handler_Exit: On Error Resume Next Exit Function Error_Handler: MsgBox "The following error has occurred" & vbCrLf & vbCrLf & _ "Error Source: ADODB_ReadFileAsText_SHOV" & vbCrLf & _ "Error Number: " & Err.Number & vbCrLf & _ "Error Description: " & Err.Description & _ Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _ , vbOKOnly + vbCritical, "An Error has Occurred!" Resume Error_Handler_Exit End Function
Entrada de archivo
Como puede ver, mi función original ha evolucionado ligeramente a lo largo de los años.
'--------------------------------------------------------------------------------------- ' Procedure : FileInput_ReadFile ' Author : CARDA Consultants Inc. ' Website : ' Purpose : Read (text) file all into memory in a single shot rather than line by line ' Copyright : The following is release as Attribution-ShareAlike 4.0 International ' (CC BY-SA 4.0) - ' Req'd Refs: None required ' Input Variables: ' ~~~~~~~~~~~~~~~~ ' sFile : Full path and filename of the file that is to be read ' ' Usage: ' ~~~~~~ ' MyTxt = FileInput_ReadFile("c:\tmp\test.txt") ' MyTxt = FileInput_ReadFile("c:\tmp\test.sql") ' MyTxt = FileInput_ReadFile("c:\tmp\test.csv") ' ' Revision History: ' Rev Date(yyyy-mm-dd) Description ' ************************************************************************************** ' 1 2012-05-14 Initial Public Release ' 2 2021-09-25 Updated Header ' Updated Error Handler ' Code Cleanup ' 3 2025-04-29 Renamed ' Added bPerformBinaryRead to switch between Text and ' Binary reading '--------------------------------------------------------------------------------------- Function FileInput_ReadFile(ByVal sFile As String, _ Optional bPerformBinaryRead As Boolean = False) As String On Error GoTo Error_Handler Dim iFileNumber As Integer Dim sFileContent As String Start: If bPerformBinaryRead Then ' Binary Read -> Images, videos, audio, exe, ... ' *********************************************** iFileNumber = FreeFile Open sFile For Binary Access Read As iFileNumber sFileContent = Space(LOF(iFileNumber)) Get #iFileNumber, , sFileContent Else ' Text Read -> txt, csv, json, xml, dat, ini, ... ' *********************************************** iFileNumber = FreeFile Open sFile For Input As #iFileNumber sFileContent = Input(LOF(iFileNumber), #iFileNumber) End If FileInput_ReadFile = sFileContent Error_Handler_Exit: On Error Resume Next Close #iFileNumber Exit Function Error_Handler: If Err.Number = 55 Then '55 - File already open Close #iFileNumber GoTo Start ElseIf Err.Number = 62 Then ' 62 - Input past end of file => tried reading binary file as text!? Close #iFileNumber bPerformBinaryRead = True 'Let's switch it to Binary reading mode GoTo Start Else MsgBox "The following error has occured" & vbCrLf & vbCrLf & _ "Error Number: " & Err.Number & vbCrLf & _ "Error Source: FileInput_ReadFile" & vbCrLf & _ "Error Description: " & Err.Description & _ Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _ , vbOKOnly + vbCritical, "An Error has Occured!" Resume Error_Handler_Exit End If End Function
FSO (objeto del sistema de archivos)
Codificación estándar
#If FSO_EarlyBind = True Then #Else 'OpenAsTextStream Constants Private Const ForReading = 1 'Open a file for reading only. You can't write to this file. Private Const ForWriting = 2 'Open a file for writing. If a file with the same name exists, its previous contents are overwritten. Private Const ForAppending = 8 'Open a file and write to the end of the file. Private Const TristateUseDefault = -2 'Opens the file by using the system default. Private Const TristateTrue = -1 'Opens the file as Unicode. Private Const TristateFalse = 0 'Opens the file as ASCII. Private Const TristateMixed = -2 '??????????? same value as TristateUseDefault ????????????? #End If Function FSO_File_ReadAll(ByVal sFile As String, Optional lTristate As Long = TristateUseDefault) As Variant On Error GoTo Error_Handler #If FSO_EarlyBind = True Then Dim oFSO As Scripting.FileSystemObject Dim oFSO_TS As Scripting.TextStream Set oFSO = New FileSystemObject #Else Dim oFSO As Object Dim oFSO_TS As Object Set oFSO = CreateObject("Scripting.FileSystemObject") #End If Set oFSO_TS = oFSO.GetFile(sFile).OpenAsTextStream(ForReading, lTristate) FSO_File_ReadAll = oFSO_TS.ReadAll Error_Handler_Exit: On Error Resume Next oFSO_TS.Close Set oFSO_TS = Nothing Set oFSO = Nothing Exit Function Error_Handler: If Err.Number = 53 Then 'File not found FSO_File_ReadAll = Null ElseIf Err.Number = 70 Then 'Permission denied - File locked/in use/... Else MsgBox "The following error has occurred" & vbCrLf & vbCrLf & _ "Error Source: FSO_File_ReadAll" & vbCrLf & _ "Error Number: " & Err.Number & vbCrLf & _ "Error Description: " & Err.Description & _ Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _ , vbOKOnly + vbCritical, "An Error has occurred!" End If Resume Error_Handler_Exit End Function
Codificación de variable de objeto de autocuración (shov)
' Req'd Refs: Late Binding -> None required ' Early Binding -> Microsoft Scripting Runtime #Const FSO_EarlyBind = False #If FSO_EarlyBind = True Then Private pFSO As Scripting.FileSystemObject #Else Private pFSO As Object #End If ' Could be incorporate in the above! #If FSO_EarlyBind = True Then #Else 'OpenAsTextStream Constants Private Const ForReading = 1 'Open a file for reading only. You can't write to this file. Private Const ForWriting = 2 'Open a file for writing. If a file with the same name exists, its previous contents are overwritten. Private Const ForAppending = 8 'Open a file and write to the end of the file. Private Const TristateUseDefault = -2 'Opens the file by using the system default. Private Const TristateTrue = -1 'Opens the file as Unicode. Private Const TristateFalse = 0 'Opens the file as ASCII. Private Const TristateMixed = -2 '??????????? same value as TristateUseDefault ????????????? #End If #If FSO_EarlyBind = True Then Public Function oFSO() As Scripting.FileSystemObject #Else Public Function oFSO() As Object #End If If pFSO Is Nothing Then Debug.Print "************************* Setting oFSO ***" Beep Beep #If FSO_EarlyBind = True Then Set pFSO = New FileSystemObject #Else Set pFSO = CreateObject("Scripting.FileSystemObject") #End If End If Set oFSO = pFSO End Function Public Sub oFSO_Clear() 'Be sure to always run this when closing your Form/DB to avoid ' hidden instances from running in the background! Set pFSO = Nothing End Sub Function FSO_File_ReadAll_SHOV(ByVal sFile As String, Optional lTristate As Long = TristateUseDefault) As Variant On Error GoTo Error_Handler #If FSO_EarlyBind = True Then Dim oFSO_TS As Scripting.TextStream #Else Dim oFSO_TS As Object #End If Set oFSO_TS = oFSO.GetFile(sFile).OpenAsTextStream(ForReading, lTristate) FSO_File_ReadAll_SHOV = oFSO_TS.ReadAll Error_Handler_Exit: On Error Resume Next oFSO_TS.Close Set oFSO_TS = Nothing Exit Function Error_Handler: If Err.Number = 53 Then 'File not found FSO_File_ReadAll_SHOV = Null ElseIf Err.Number = 70 Then 'Permission denied - File locked/in use/... Else MsgBox "The following error has occurred" & vbCrLf & vbCrLf & _ "Error Source: FSO_File_ReadAll_SHOV" & vbCrLf & _ "Error Number: " & Err.Number & vbCrLf & _ "Error Description: " & Err.Description & _ Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _ , vbOKOnly + vbCritical, "An Error has occurred!" End If Resume Error_Handler_Exit End Function
Pruebas
Lo que hice fue crear una rutina que recorriera una multitud de archivos (TXT, SQL, JSON, XML, INI, …) y usar mi función de temporizador para determinar el tiempo transcurrido para leer cada uno y pasarlo a una variable. Hice esto para la unión temprana, la unión tardía, la unión temprana, la unión tardía y luego compilé los valores y lo comparé todo.
Para los archivos de los pequeños, los resultados demostraron claramente que ADODB era el camino a seguir. El orden de rendimiento fue:
- ADODB Stream Buya de encuadernación temprana
- ADODB Stream Plav de encuadernación tardía
- ADODB Stream encuadernación tardía
- ADODB Stream encuadernación temprana
- Entrada de archivo
- FSO Shov temprana de encuadernación
- FSO Plava de encuadernación tardía
- FSO Binding Early
- FSO Bonda tardía
con el FSO siendo sustancialmente más lento.
Pero donde las cosas comenzaron a ponerse interesantes fue cuando realicé una ronda similar de pruebas, pero en archivos más grandes (2 MB, 7 MB, …), luego la historia pareció cambiar y ADODB Stream estaba funcionando lo peor, mientras que la entrada de archivos lideraba el paquete.
Bueno, creo que para la mayoría de los casos, la necesidad de leer archivos es leer archivos relativamente pequeños y, por lo tanto, uno de los enfoques de transmisión ADODB tiene más sentido.
Si, por otro lado, sabe que siempre está manejando archivos más grandes, entonces vaya con el enfoque de entrada del archivo.
Y si desea un buen desempeño, independientemente del tamaño del archivo, entonces, una vez más, ¡vaya con el enfoque de entrada del archivo!
¡Solo mantente alejado de FSO!
Incluso podría llegar tan lejos como crear una rutina simple que use Filelen () para determinar el tamaño del archivo y usar la transmisión ADODB cuando sea 1 MB o menos y la entrada de archivo para todo lo demás.