[whatwg] Drag-and-drop folders/files support with directory structure using DirectoryEntry

Thu Nov 17 14:33:50 PST 2011

On Wed, Nov 16, 2011 at 15:59, Eric U <ericu at google.com> wrote:

> On Wed, Nov 16, 2011 at 3:55 PM, Daniel Cheng <dcheng at chromium.org> wrote:
> > Let's say I drag my pictures directory to a web app uploader. If this
> > uploader passes the DirectoryEntry to my pictures directory to a worker,
> > will it be able to read files I create a long time after the original
> > drag? It sounds like the approach being advocated would allow that type
> > of attack.
>
> I think it's a bit of an exaggeration to call that an "attack", but
> yes, we'll have to make sure we set expectations appropriately.
>

What do you mean by set expectations?

On Thu, Nov 17, 2011 at 11:12, Kinuko Yasuda <kinuko at chromium.org> wrote:

> On Fri, Nov 18, 2011 at 3:18 AM, Jonas Sicking <jonas at sicking.cc> wrote:
> > On Wed, Nov 16, 2011 at 7:09 AM, Kinuko Yasuda <kinuko at chromium.org> wrote:
> >> On Wed, Nov 16, 2011 at 5:42 PM, Jonas Sicking <jonas at sicking.cc> wrote:
> >>> On Tue, Nov 15, 2011 at 3:02 PM, Glenn Maynard <glenn at zewt.org> wrote:
> >>>> On Tue, Nov 15, 2011 at 5:21 PM, Jonas Sicking <jonas at sicking.cc> wrote:
> >>>>>
> >>>>> Adding FileEntry/DirectoryEntry seems confusing since those are
> >>>>> generally writable in the FileSystem API spec, right? Additionally,
> >>>>> DirectoryEntry is asynchronous, which makes enumerating the tree more
> >>>>> painful.
> >>>>>
> >>>>> The way we were planning on exposing this in Gecko is to simply set
> >>>>> File.name to the full relative path to the folder dropped. So in the
> >>>>> example above, if the user dropped the "Photos" folder on a page,
> >>>>> we'd make .files return a list of 7 Files, with names like
> >>>>> "Photos/trip/1.jpg", "Photos/trip/2.jpg", "Photos/trip/3.jpg",
> >>>>> "Photos/halloween/a.jpg", etc.
> >>>>
> >>>> That requires a full directory traversal in advance to find all of the
> >>>> files, though; the tree could be very large.  For example, a sharded
> >>>> directory tree containing hundreds of thousands of files with
> >>>> individual frames of a video isn't unheard of, and there's no need to
> >>>> read it all in advance.  Directory trees with tens of thousands of
> >>>> photos, audio clips, emails (Maildir), etc. aren't uncommon, either.
> >>>>
> >>>> DirectoryEntry's asynchronous API seems to have the same advantages
> >>>> here as it does for regular filesystem access.  It would also set the
> >>>> stage for exposing writable directories down the line (eg. drag an
> >>>> input and output directory for file processing), after the security
> >>>> issues are figured out.
> >>>
> >>> You need to do that anyway to implement the .files attribute, no?
> >>
> >> Yes, but even if we provide the attribute today it wouldn't give the
> >> best user experience, and it could break in some likely scenarios.
> >>
> >> If we can think of a better option, I think we should make it available.
> >
> > I'm not sure I understand what you mean.
> >
> > As long as you support the .files property, you need to traverse all
> > files that are selected before firing the final drop event. Otherwise
> > you risk having to do synchronous IO if someone does access the .files
> > property.
> >
> > Though you could do what another email in this thread suggested and
> > not traverse subdirectories when populating .files. Is that what
> > you're planning to do?
>
> (I think I have confused you, sorry)
>
> We support folders in .files for <input type=file> but only when
> 'webkitdirectory' is specified, so we do not always do the traversal.
> We do not support folders in .files in dataTransfer either (and have
> no plan to do so for now).
>
> > I'm still not convinced that providing an API which provides
> > asynchronous traversal of the files is going to lead to a better user
> > experience. In all scenarios that I can think of, the page which
> > received the drop is going to want to traverse the whole directory
> > tree anyway. For example, in order to create a list of files which
> > contain images so as to display previews of them. Or to store them in
> > the sandboxed FileSystem API or IndexedDB. Or to submit the files using
> > XHR to the server.
> >
> > So by providing references to just the root of directories we are not
> > in fact reducing IO, just shifting it from the UA doing it to the
> > webpage doing it. From a user's point of view that doesn't seem to be
> > an improvement.
>
> I don't think the actual amount of IO determines the user
> experience. In most cases users feel frustrated when their action is
> not reflected immediately (e.g. due to blocking IO operations), or
> when they cannot control what they expected to be able to control.
>
> "shifting it from the UA doing it to the webpage doing it" means that
> we give more control to the webapps, and I believe that it would
> also open up more possibilities.
>
> Say, webpages could show a consistent fancy file selection UI on any
> UA/platform, and this could also reduce the actual IO by
> letting the user refine what they dropped in.  Or webpages could build
> a cool file-browser type application (not only for upload) on any
> platform.  Or they could show a nice progress meter with cancellation
> UI so that the user does not need to worry about the UA hanging
> even if s/he mistakenly dropped a huge directory, say, '/' or the root
> of slow media.  I think we could call such scenarios an improvement.
> wdyt?
>
> > / Jonas
> >
>

I think the biggest advantage is a lot less jank, because the potentially
slow IO of traversing through the directory occurs asynchronously/in a
worker, rather than on the browser's main thread.
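
To make the asynchronous traversal concrete, here is a rough sketch of a
callback-style walk over a dropped directory. The interfaces below are minimal
local stand-ins that mirror the entry/reader shape in the FileSystem API draft;
walk(), fakeDir() and fakeFile() are hypothetical names used only for
illustration, and the in-memory fakes exist just so the sketch runs outside a
browser. It reads each directory in batches until readEntries() returns an
empty list, and checks a cancellation hook between batches.

```typescript
// Local stand-ins for the draft FileSystem API types (assumed shape).
interface Entry {
  name: string;
  fullPath: string;
  isFile: boolean;
  isDirectory: boolean;
}
interface DirectoryReader {
  readEntries(success: (entries: Entry[]) => void,
              error?: (err: Error) => void): void;
}
interface DirectoryEntry extends Entry {
  createReader(): DirectoryReader;
}

// Hypothetical helper: visits every file under `root` without blocking.
// `shouldCancel` lets the page stop early, e.g. from a "Cancel" button.
function walk(root: DirectoryEntry,
              onFile: (entry: Entry) => void,
              onDone: () => void,
              onError: (err: Error) => void,
              shouldCancel: () => boolean = () => false): void {
  const pending: DirectoryEntry[] = [root];
  const nextDirectory = (): void => {
    const dir = pending.pop();
    if (dir === undefined || shouldCancel()) { onDone(); return; }
    const reader = dir.createReader();
    const readBatch = (): void => {
      if (shouldCancel()) { onDone(); return; }
      reader.readEntries((entries) => {
        // An empty batch means this directory is exhausted.
        if (entries.length === 0) { nextDirectory(); return; }
        for (const entry of entries) {
          if (entry.isDirectory) pending.push(entry as DirectoryEntry);
          else onFile(entry);
        }
        readBatch();
      }, onError);
    };
    readBatch();
  };
  nextDirectory();
}

// In-memory fakes for demonstration only; they hand out two entries per batch.
function fakeFile(fullPath: string): Entry {
  const name = fullPath.split("/").pop() ?? "";
  return { name, fullPath, isFile: true, isDirectory: false };
}
function fakeDir(fullPath: string, children: Entry[]): DirectoryEntry {
  return {
    name: fullPath.split("/").pop() ?? "",
    fullPath,
    isFile: false,
    isDirectory: true,
    createReader() {
      let offset = 0;
      return {
        readEntries(success) {
          const batch = children.slice(offset, offset + 2);
          offset += batch.length;
          success(batch);
        },
      };
    },
  };
}
```

A page could pass an onFile callback that appends to a preview list and a
shouldCancel hook wired to its own UI, so a mistakenly dropped huge directory
can be abandoned partway through.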

On Thu, Nov 17, 2011 at 12:28, Glenn Maynard <glenn at zewt.org> wrote:

>
> So much effort has been made to move towards fully asynchronous UI that
> making this synchronous would be a major loss, and this also leads very
> nicely towards a read-write interface.
>

Can you give an example of how you envision the write portion working?

> I think there's an even bigger problem with doing traversal in advance.
> You can't wait until the drag completes (eg. the mouse is released) before
> performing the traversal; it needs to be complete before the first
> dragstart event can be fired, to fill in DataTransfer.  Even if the
> traversal is lazy, any code which triggers the traversal will force the
> user to sit there holding down the mouse button, waiting for it to
> complete.
>

That said, the user shouldn't have to hold down the mouse button. The
files attribute isn't available until the drop event fires anyway.
However, this is why the items attribute (which is available in
dragenter/dragover events as well) was added; some authors wanted to be
able to use the file's MIME type to determine how to handle the drag. I
guess directory support there would almost certainly be out of the question.
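
As a rough illustration of the MIME-type use case, a dragover handler might
run something like the sketch below over dataTransfer.items. DragItemLike and
shouldAcceptDrag() are hypothetical names; the only properties relied on are
`kind` and `type`, which items expose. How a dragged directory would show up
here is an open question; the sketch simply assumes it would carry no useful
MIME type and therefore would not match.

```typescript
// Minimal stand-in for the parts of an item this check needs (assumed shape).
interface DragItemLike {
  kind: string; // "file" or "string"
  type: string; // MIME type, possibly empty
}

// Hypothetical helper: accept the drag if any file item matches the prefix.
function shouldAcceptDrag(items: ArrayLike<DragItemLike>,
                          acceptedPrefix: string = "image/"): boolean {
  for (let i = 0; i < items.length; i++) {
    const item = items[i];
    if (item.kind === "file" && item.type.indexOf(acceptedPrefix) === 0) {
      return true;
    }
  }
  return false;
}
```

In a page this would be called as shouldAcceptDrag(event.dataTransfer.items)
from the dragenter/dragover handlers, before any file contents are available.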

Daniel