Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others



Apache Beam - Is a source with an unknown but finite number of elements a BoundedSource or an UnboundedSource?

Is a source that has an unknown but finite number of elements considered a BoundedSource or an UnboundedSource?

If I were able to implement both BoundedSource and UnboundedSource, which one is "better"? By "better" I mean which would offer more options or better performance.

I'm going to crawl a website that has pagination, so initially I don't know how many pages I will crawl. However, I am sure that the number is not infinite.



1 Answer


A source that is finite is bounded, even if its size is not known in advance. BoundedSource will allow you to run the job as a batch job, which will likely be faster. The only reason to use UnboundedSource here would be if you wanted to process the data as it is read, instead of all at once after everything has been read.
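
To make the "unknown but finite" case concrete, here is a minimal plain-Java sketch of a paginated reader. It does not use the Beam SDK; it only loosely follows the start/advance/getCurrent shape of Beam's reader interface. The names PagedCrawlSketch, Page, PagedReader, fetchPage and readAll are all hypothetical, and the fake in-memory "site" stands in for real HTTP calls. The point is that the reader never needs the page count up front: it just reports the end of input when a page has no successor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

/** Plain-Java sketch; all names are hypothetical and nothing here depends on Apache Beam. */
class PagedCrawlSketch {

    /** One fetched page: its items plus whether a next page exists. */
    static final class Page {
        final List<String> items;
        final boolean hasNext;

        Page(List<String> items, boolean hasNext) {
            this.items = items;
            this.hasNext = hasNext;
        }
    }

    /** Reader over a paginated site whose page count is unknown up front but finite. */
    static final class PagedReader {
        private final IntFunction<Page> fetchPage; // assumed fetcher: page number -> Page
        private int pageNumber = 0;
        private Page page;
        private int index = -1;
        private String current;

        PagedReader(IntFunction<Page> fetchPage) {
            this.fetchPage = fetchPage;
        }

        /** Fetches the first page and positions on the first element; false if there is none. */
        boolean start() {
            page = fetchPage.apply(pageNumber);
            return advance();
        }

        /** Moves to the next element, fetching further pages (and skipping empty ones) as needed. */
        boolean advance() {
            while (true) {
                if (index + 1 < page.items.size()) {
                    index++;
                    current = page.items.get(index);
                    return true;
                }
                if (!page.hasNext) {
                    current = null;
                    return false; // end of input reached: this is what makes the source bounded
                }
                pageNumber++;
                page = fetchPage.apply(pageNumber);
                index = -1;
            }
        }

        String getCurrent() {
            if (current == null) {
                throw new NoSuchElementException();
            }
            return current;
        }
    }

    /** Drains the reader into a list, the way a batch job would consume a bounded input. */
    static List<String> readAll(IntFunction<Page> fetchPage) {
        List<String> out = new ArrayList<>();
        PagedReader reader = new PagedReader(fetchPage);
        for (boolean more = reader.start(); more; more = reader.advance()) {
            out.add(reader.getCurrent());
        }
        return out;
    }

    public static void main(String[] args) {
        // Fake three-page site (the middle page is empty) standing in for real HTTP calls.
        List<List<String>> site = List.of(List.of("a", "b"), List.of(), List.of("c"));
        IntFunction<Page> fetch = n -> new Page(site.get(n), n + 1 < site.size());
        System.out.println(readAll(fetch)); // prints [a, b, c]
    }
}
```

In a real pipeline the same loop would live inside a reader created by a BoundedSource subclass, and the size estimate can stay a rough guess since the total is only discovered while reading.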

