public class Reader extends Object implements PropertiesProvider, MetadataProvider
Reads data from input sources into instances of Document. Instances of this class must be used on a single thread exclusively. They must not be used concurrently.
Instances of this class are used to carry out read processes. A read process consists of one or more read calls, that is, invocations of the various read(...) methods provided by this class. A read process which includes all the data for one logical document (which could consist of several data streams) is called a logically coherent read process.
It is common practice to read multiple data streams using one Reader instance, for example multiple single-page data streams and annotations that form one logical document. It is highly recommended to use a separate Reader instance for every logically coherent read process.
The read process performed by an instance of Reader is uniquely identified by an identification String called the Read ID.
Listening for progress
Using addReaderListener(ReaderListener), interested parties can register themselves to receive information on state changes.
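The register/notify/unregister pattern described here can be sketched without the jadice library. Everything below is a hypothetical stand-in: `SketchListener`, its `stateChanged` callback, `ListenerSketchReader`, and the state names `STARTED` and `COMPLETED` are assumptions made for illustration and do not reflect the real `ReaderListener` interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ListenerSketch {
    // Registers a listener, fires one event, unregisters, fires another.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        ListenerSketchReader reader = new ListenerSketchReader("read-1");
        SketchListener listener = (id, state) -> log.add(id + ":" + state);
        reader.addReaderListener(listener);
        reader.fire("STARTED");      // delivered to the listener
        reader.removeReaderListener(listener);
        reader.fire("COMPLETED");    // no longer delivered
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Hypothetical stand-in for ReaderListener; the real callback signature may differ.
interface SketchListener {
    void stateChanged(String readID, String newState);
}

// Hypothetical stand-in for Reader, reduced to listener bookkeeping.
class ListenerSketchReader {
    private final String readID;
    private final List<SketchListener> listeners = new CopyOnWriteArrayList<>();

    ListenerSketchReader(String readID) {
        this.readID = readID;
    }

    void addReaderListener(SketchListener listener) {
        listeners.add(listener);
    }

    void removeReaderListener(SketchListener listener) {
        listeners.remove(listener);
    }

    // Stand-in for a state change that a reader would report to its listeners.
    void fire(String newState) {
        for (SketchListener listener : listeners) {
            listener.stateChanged(readID, newState);
        }
    }
}
```

In this sketch only the first event reaches the listener, because it is removed before the second one is fired.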
Cancelation
Using getCanceler(), an object can be obtained that can be used to request the cancelation of currently running read operations. If the FormatReader in use supports cancelation, the format read process will be stopped at an appropriate time. Once cancelation has been requested, any subsequent calls to one of the read(...) methods will have no effect.
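A minimal model of this contract, assuming a flag-based canceler: once cancelation has been requested, later read calls do nothing and report `false`. `SketchCanceler`, its `cancel()` method, and `CancelSketchReader` are hypothetical stand-ins; the real `Canceler` API is not shown on this page and may look different.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class CancelSketch {
    // Returns { first read result, second read result, exactly one stream was read }.
    static boolean[] demo() {
        CancelSketchReader reader = new CancelSketchReader();
        boolean first = reader.read("stream-1");
        SketchCanceler canceler = reader.getCanceler();
        Thread other = new Thread(canceler::cancel); // request cancelation from another thread
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        boolean second = reader.read("stream-2");    // has no effect after cancelation
        return new boolean[] { first, second, reader.streamsRead() == 1 };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}

// Hypothetical stand-in for Canceler: a thread-safe cancelation flag.
class SketchCanceler {
    private final AtomicBoolean canceled = new AtomicBoolean();

    void cancel() {
        canceled.set(true);
    }

    boolean isCanceled() {
        return canceled.get();
    }
}

// Hypothetical stand-in for Reader; the reader itself stays on one thread.
class CancelSketchReader {
    private final SketchCanceler canceler = new SketchCanceler();
    private int streams;

    SketchCanceler getCanceler() {
        return canceler;
    }

    // true: read fully performed; false: cancelation was requested.
    boolean read(String source) {
        if (canceler.isCanceled()) {
            return false;
        }
        streams++;
        return true;
    }

    int streamsRead() {
        return streams;
    }
}
```

The reader is only ever touched by one thread here; only the canceler crosses thread boundaries, which matches the threading rules stated for this class.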
Completion
Once all data for a Document is read, users of this class will typically call complete(). This will inform registered listeners.
Subclassing
Although this class is not final, it is not intended for subclassing in integrations. Internal details may change in any release and without prior notice.
Modifier and Type | Field and Description |
---|---|
static int | AFTER_LAST_PAGE - Constant to specify that new pages should be added after the target Document's currently last page. |
static int | BEFORE_FIRST_PAGE - Constant to specify that new pages should be added before the target Document's first page. |
protected boolean | complete |

Constructor and Description |
---|
Reader() - Create a new instance with a default, unique Read ID. |
Reader(String readID) - Create a new instance with the given Read ID. |

Modifier and Type | Method and Description |
---|---|
void | addReaderListener(ReaderListener listener) - Add the given ReaderListener to the list of registered listeners. |
void | clearSettings(Class<? extends ReaderSettings> c) - Restore settings of a given kind to their default values. |
void | complete() - Signal to the reader that the current read process is complete. |
protected com.levigo.jadice.document.internal.read.ReadTask | createTask(Provider<? extends InputStream,IOException> streamProvider) |
Canceler | getCanceler() - Returns an object that can be used to request cancelation of running read processes. |
Document | getDocument() - Retrieve the Document instance to which subsequent read calls add their resulting document data (PageSegments etc.). |
Format | getFormat() - Retrieve the Format instance which governs how subsequent read calls interpret the data they read. |
Map<DocumentLayer,DocumentLayer> | getLayerMapping() - Retrieve the map which specifies where PageSegments resulting from subsequent read calls should be placed within the virtual layers of a Page. |
Metadata | getMetadata() - Returns the metadata, if any, to be used by subsequent read calls. |
Map<String,Object> | getProperties() - Return a map of user properties which can be used during the read process, in particular by FormatReaders. |
String | getReadID() - Retrieve an identifier for the read process that is performed using this Reader instance. |
<S extends ReaderSettings> S | getSettings(Class<S> c) - Retrieve settings of a given kind. |
int | getStreamIndex() - Return the index of the last stream read by this reader. |
int | getTargetIndex() - Get the target page index to be used for page segments read by subsequent read calls. |
protected void | handleTask(com.levigo.jadice.document.internal.read.ReadTask readTask) - Not intended to be overridden in integrations! May change at any time and without prior notice. |
boolean | read(File file) - Transforms data from the given source into a jadice document model representation. |
boolean | read(InputStream is) - Transforms data from the given source into a jadice document model representation. |
boolean | read(Provider<? extends InputStream,IOException> streamProvider) - Transforms data from the given source into a jadice document model representation. |
boolean | read(URL url) - Transforms data from the given source into a jadice document model representation. |
void | removeReaderListener(ReaderListener listener) - Remove the given ReaderListener from the list of registered listeners. |
void | setDocument(Document document) - Replace this instance's target document to be used by subsequent read calls. |
void | setFormat(Format format) - Specify the Format instance which governs how subsequent read calls interpret the data they read. |
void | setLayerMapping(Map<DocumentLayer,DocumentLayer> layerMapping) - Replace the entire layer mapping to be used by subsequent read calls. |
void | setMetadata(Metadata metadata) - Replace the entire metadata to be used by subsequent read calls. |
void | setReaderControls(ReaderControls rc) - Replace the existing ReaderControls instance with all its ReaderSettings. |
void | setTargetIndex(int targetIndex) - Set the target page index to be used for page segments read by subsequent read calls. |

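public static final int AFTER_LAST_PAGE
Constant to specify that new pages should be added after the target Document's currently last page.
public static final int BEFORE_FIRST_PAGE
Constant to specify that new pages should be added before the target Document's first page.
protected boolean complete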
public Reader()
Create a new instance with a default, unique Read ID.
public Reader(String readID)
Create a new instance with the given Read ID.
Parameters:
readID - A unique ID String which is used to identify the read process performed by the newly created instance. The ID must be unique at run-time and with respect to all other Read IDs. If this parameter is null, a default ID will be generated.
public void clearSettings(Class<? extends ReaderSettings> c)
Restore settings of a given kind to their default values.
Parameters:
c - the kind of settings to be restored
See Also:
ProcessingControls.clearSettings(Class)
public <S extends ReaderSettings> S getSettings(Class<S> c)
Retrieve settings of a given kind.
Type Parameters:
S - the kind of settings this method call deals with
Parameters:
c - the kind of settings to be retrieved
See Also:
ProcessingControls.getSettings(Class)
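Class-keyed access of this kind is often modeled as a type-safe heterogeneous container. The sketch below shows one plausible behavior, under the assumption that settings objects are created lazily with their defaults and that clearing simply discards the stored instance. `SketchControls`, `SketchSettings`, and `DpiSettings` (with its `dpi` field and default of 72) are hypothetical stand-ins, not jadice classes, and this is not a description of how `ReaderControls` is implemented.

```java
import java.util.HashMap;
import java.util.Map;

class SettingsSketch {
    // Returns { value after modification, value after clearSettings }.
    static int[] demo() {
        SketchControls controls = new SketchControls();
        controls.getSettings(DpiSettings.class).dpi = 300;
        int changed = controls.getSettings(DpiSettings.class).dpi;
        controls.clearSettings(DpiSettings.class);
        int restored = controls.getSettings(DpiSettings.class).dpi;
        return new int[] { changed, restored };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1]);
    }
}

// Hypothetical stand-in for ReaderSettings.
interface SketchSettings {
}

// Hypothetical settings kind with a single default value.
class DpiSettings implements SketchSettings {
    int dpi = 72;
}

// Hypothetical stand-in for a class-keyed settings store.
class SketchControls {
    private final Map<Class<?>, SketchSettings> byKind = new HashMap<>();

    // Returns the stored instance for the given kind, creating defaults on first access.
    <S extends SketchSettings> S getSettings(Class<S> c) {
        SketchSettings settings = byKind.get(c);
        if (settings == null) {
            try {
                settings = c.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("No default constructor for " + c, e);
            }
            byKind.put(c, settings);
        }
        return c.cast(settings);
    }

    // Discards the stored instance so the next access yields defaults again.
    void clearSettings(Class<? extends SketchSettings> c) {
        byKind.remove(c);
    }
}
```

Keying by `Class<S>` lets the caller get a correctly typed settings object back without casting at the call site.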
public void addReaderListener(ReaderListener listener)
Add the given ReaderListener to the list of registered listeners.
Parameters:
listener - the listener to add
public void removeReaderListener(ReaderListener listener)
Remove the given ReaderListener from the list of registered listeners.
Parameters:
listener - the listener to remove
public boolean read(URL url) throws JadiceException, IOException, IllegalArgumentException
Transforms data from the given source into a jadice document model representation.
Parameters:
url - the source to be read
Returns:
true if the read was fully performed (either successfully or with errors); false if cancelation was requested
Throws:
JadiceException
IOException - if an I/O error occurs
IllegalArgumentException - if the given URL is null
IllegalStateException - if an attempt is made to read with an instance that has already been marked as complete()
public boolean read(File file) throws JadiceException, IOException
Transforms data from the given source into a jadice document model representation.
Parameters:
file - the source to read from
Returns:
true if the read was fully performed (either successfully or with errors); false if cancelation was requested
Throws:
JadiceException
IOException
IllegalStateException - if an attempt is made to read with an instance that has already been marked as complete()
public boolean read(InputStream is) throws JadiceException, IOException
Transforms data from the given source into a jadice document model representation.
Parameters:
is - the source to read from
Returns:
true if the read was fully performed (either successfully or with errors); false if cancelation was requested
Throws:
JadiceException
IOException
IllegalStateException - if an attempt is made to read with an instance that has already been marked as complete()
public boolean read(Provider<? extends InputStream,IOException> streamProvider) throws JadiceException, IOException
Transforms data from the given source into a jadice document model representation.
Parameters:
streamProvider - supplies the source to read from
Returns:
true if the read was fully performed (either successfully or with errors); false if cancelation was requested
Throws:
JadiceException
IOException
IllegalStateException - if an attempt is made to read with an instance that has already been marked as complete()
protected com.levigo.jadice.document.internal.read.ReadTask createTask(Provider<? extends InputStream,IOException> streamProvider)
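protected void handleTask(com.levigo.jadice.document.internal.read.ReadTask readTask) throws IOException, JadiceException
Not intended to be overridden in integrations! May change at any time and without prior notice.
Throws:
IOException
JadiceException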
public Document getDocument()
Retrieve the Document instance to which subsequent read calls add their resulting document data (PageSegments etc.).
public void setDocument(Document document)
Replace this instance's target document to be used by subsequent read calls.
Parameters:
document - the target document to be used from now on
See Also:
getDocument()
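public int getTargetIndex()
Get the target page index to be used for page segments read by subsequent read calls.
Returns:
the zero-based target index or one of the special cases BEFORE_FIRST_PAGE and AFTER_LAST_PAGE.
See Also:
BEFORE_FIRST_PAGE, AFTER_LAST_PAGE
public void setTargetIndex(int targetIndex)
Set the target page index to be used for page segments read by subsequent read calls.
Parameters:
targetIndex - the zero-based target index or one of the special cases BEFORE_FIRST_PAGE and AFTER_LAST_PAGE.
See Also:
BEFORE_FIRST_PAGE, AFTER_LAST_PAGE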
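How a target index with two special cases can be turned into a concrete position is sketched below. The sentinel values `-1` and `-2` are assumptions chosen for illustration, since the actual values of `AFTER_LAST_PAGE` and `BEFORE_FIRST_PAGE` are not stated here, and `resolve` is a hypothetical helper, not part of the `Reader` API. The sketch only covers where the first newly read page would land; whether existing pages at a numeric index are extended or shifted is not modeled.

```java
class TargetIndexSketch {
    // Hypothetical sentinel values; the real constants may use different numbers.
    static final int AFTER_LAST_PAGE = -1;
    static final int BEFORE_FIRST_PAGE = -2;

    // Maps a target index (zero-based or sentinel) to a concrete position
    // in a document that currently has pageCount pages.
    static int resolve(int targetIndex, int pageCount) {
        if (targetIndex == AFTER_LAST_PAGE) {
            return pageCount;   // one past the currently last page
        }
        if (targetIndex == BEFORE_FIRST_PAGE) {
            return 0;           // in front of the first page
        }
        if (targetIndex < 0 || targetIndex > pageCount) {
            throw new IndexOutOfBoundsException("targetIndex: " + targetIndex);
        }
        return targetIndex;     // plain zero-based indexes pass through unchanged
    }

    public static void main(String[] args) {
        System.out.println(resolve(AFTER_LAST_PAGE, 3));
        System.out.println(resolve(BEFORE_FIRST_PAGE, 3));
        System.out.println(resolve(1, 3));
    }
}
```

Note that `AFTER_LAST_PAGE` is resolved against the page count at the time of the call, which mirrors the "currently last page" wording of the constant's description.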
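public Metadata getMetadata()
Returns the metadata, if any, to be used by subsequent read calls.
Specified by:
getMetadata in interface MetadataProvider
Returns:
the metadata to be used by subsequent read calls; may be null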
public void setMetadata(Metadata metadata)
Replace the entire metadata to be used by subsequent read calls.
Parameters:
metadata - the metadata for this instance and/or the stream(s) to be read. May be null.
public Map<String,Object> getProperties()
Return a map of user properties which can be used during the read process, in particular by FormatReaders. Whether or not any of these properties end up in the resulting Document's user properties depends on individual FormatReader implementations.
Typically, integrations will not need to add properties to this map.
Specified by:
getProperties in interface PropertiesProvider
Returns:
this Reader's map of user properties
public Map<DocumentLayer,DocumentLayer> getLayerMapping()
Retrieve the map which specifies where PageSegments resulting from subsequent read calls should be placed within the virtual layers of a Page. Most formats would typically place the PageSegments they read on the default layer (DocumentLayer.DEFAULT); annotations typically use the annotations layer (DocumentLayer.ANNOTATIONS). Using this map, PageSegments can be re-routed to another (possibly integration-specific) layer.
public void setLayerMapping(Map<DocumentLayer,DocumentLayer> layerMapping)
Replace the entire layer mapping to be used by subsequent read calls.
Parameters:
layerMapping - the mapping to be applied during subsequent read calls
See Also:
getLayerMapping()
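The re-routing idea behind the layer mapping can be illustrated with a plain map lookup. In this sketch, `SketchLayer` is a hypothetical stand-in for `DocumentLayer` (which may not be an enum), `REVIEW` is an invented integration-specific layer, and the rule that unmapped layers stay where they are is an assumption rather than documented behavior.

```java
import java.util.HashMap;
import java.util.Map;

class LayerMappingSketch {
    // Looks up the layer a segment should end up on; unmapped layers are kept as-is (assumption).
    static SketchLayer route(Map<SketchLayer, SketchLayer> mapping, SketchLayer produced) {
        return mapping.getOrDefault(produced, produced);
    }

    // Returns the effective layers for a default-layer segment and an annotation segment.
    static SketchLayer[] demo() {
        Map<SketchLayer, SketchLayer> mapping = new HashMap<>();
        mapping.put(SketchLayer.ANNOTATIONS, SketchLayer.REVIEW); // re-route annotations
        return new SketchLayer[] {
            route(mapping, SketchLayer.DEFAULT),
            route(mapping, SketchLayer.ANNOTATIONS)
        };
    }

    public static void main(String[] args) {
        SketchLayer[] r = demo();
        System.out.println(r[0] + " " + r[1]);
    }
}

// Hypothetical stand-in for DocumentLayer; REVIEW is an invented integration-specific layer.
enum SketchLayer {
    DEFAULT, ANNOTATIONS, REVIEW
}
```

Here a segment produced for the annotations layer ends up on the invented `REVIEW` layer, while a default-layer segment is left alone.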
public Format getFormat()
Retrieve the Format instance which governs how subsequent read calls interpret the data they read.
Returns:
the Format to be used by subsequent read calls, or null in case of auto-detection.
public void setFormat(Format format)
Specify the Format instance which governs how subsequent read calls interpret the data they read.
Parameters:
format - the Format to be used by subsequent read calls. Pass null to request format auto-detection.
public void setReaderControls(ReaderControls rc)
Replace the existing ReaderControls instance with all its ReaderSettings.
Passing a null argument resets the internal controls instance to its initial state.
public String getReadID()
Retrieve an identifier for the read process that is performed using this Reader instance.
public int getStreamIndex()
Return the index of the last stream read by this reader.
public void complete()
Signal to the reader that the current read process is complete. This will inform registered ReaderListeners. Depending on whether or not the read process was canceled, the event's ReaderListener.ReaderEvent.Type is either ReaderListener.ReaderEvent.Type.READ_CANCELED or ReaderListener.ReaderEvent.Type.READ_COMPLETED. Furthermore, on the Document retrieved by a call to getDocument(), the document state is updated if it is still set to Document.BasicState.LOADING. Depending on whether or not the read process was canceled, the new state is Document.BasicState.UNKNOWN or Document.BasicState.READY.
After calling this method, further calls to any of the read(...) methods will result in IllegalStateExceptions.
Using this method is encouraged, but not mandatory.
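The state rules of complete() can be condensed into a small model. `CompleteSketchReader` and `SketchState` below are hypothetical stand-ins written for this illustration; they mirror only the documented rules (LOADING becomes READY, or UNKNOWN after cancelation, and reads after completion fail) and say nothing about the real implementation.

```java
class CompleteSketch {
    // Returns { state after normal completion, outcome of a read after complete(),
    //           state after completing a canceled process }.
    static String[] demo() {
        CompleteSketchReader normal = new CompleteSketchReader();
        normal.read();
        normal.complete();
        String afterComplete;
        try {
            normal.read();
            afterComplete = "no exception";
        } catch (IllegalStateException e) {
            afterComplete = "IllegalStateException";
        }

        CompleteSketchReader canceled = new CompleteSketchReader();
        canceled.cancel();
        canceled.complete();
        return new String[] { normal.state.name(), afterComplete, canceled.state.name() };
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", demo()));
    }
}

// Hypothetical stand-in for Document.BasicState, limited to the states named above.
enum SketchState {
    LOADING, READY, UNKNOWN
}

// Hypothetical stand-in for Reader, reduced to the completion rules.
class CompleteSketchReader {
    SketchState state = SketchState.LOADING;
    private boolean canceled;
    private boolean completed;

    void cancel() {
        canceled = true;
    }

    // Throws once the process is complete; returns false if cancelation was requested.
    boolean read() {
        if (completed) {
            throw new IllegalStateException("read process already marked as complete");
        }
        return !canceled;
    }

    // Updates the state only if it is still LOADING.
    void complete() {
        completed = true;
        if (state == SketchState.LOADING) {
            state = canceled ? SketchState.UNKNOWN : SketchState.READY;
        }
    }
}
```

The model distinguishes the two failure modes described for this class: a read after cancelation quietly returns `false`, whereas a read after completion throws.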
public Canceler getCanceler()
Returns an object that can be used to request cancelation of running read processes. The Canceler returned from this method is thread-safe. It may be passed on to other threads and its methods may be invoked on other threads.
Copyright © 2024 levigo holding gmbh. All rights reserved.